Thanks for the examples. Personally, I have been caught out multiple times
by data frames dropping dimensions, so I have a distaste for this dropping
behaviour.
I prefer data frames *not* to drop dimensions. They are not arrays, where
dropping a dimension when slicing makes sense because all entries have the
same data type.
You can pull out a column in vector form from both tibbles and data frames
with the $ operator. Subsetting a row from a data frame and forcing it into
an atomic vector requires casting all columns to the lowest common
denominator, often character.
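To make the two cases concrete, here is a minimal sketch. The data are made up for illustration, and the tibble package is assumed to be installed.

```r
library(tibble)  # assumed to be installed

# Illustrative data, not taken from the thread
df <- data.frame(a = 1:3, b = c("x", "y", "z"), stringsAsFactors = FALSE)
tb <- as_tibble(df)

# `$` returns the bare column for both classes
col_df <- df$a
col_tb <- tb$a
identical(col_df, col_tb)

# Flattening one row forces the integer and character columns into one type
row_vec <- unlist(df[2, ])
class(row_vec)  # "character"
```

The integer in column a ends up as the string "2", which is the "lowest common denominator" effect described above.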
So I would argue that yes, tibbles are data frames with extra bells and
whistles, even if I do not understand the use of list columns.
I suggest a defensive coding technique: if you need a data frame subset to
really be a vector, cast it as a vector. Users *will* attempt to throw
unexpected structures at your methods. When your method fails in
mysterious ways because it didn't extract a vector, users will be
stupefied. A failure at `as.vector` will indicate why.
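One way this could look in practice is sketched below. The helper name extract_vector is hypothetical, and the sketch uses `[[` plus an explicit is.atomic() check in front of as.vector(), since as.vector() on its own may not raise an error for every unexpected input.

```r
# Hypothetical helper: return one column as an atomic vector, or stop early
# with a message that names the problem.
extract_vector <- function(data, column) {
  stopifnot(is.data.frame(data), is.character(column), length(column) == 1L)
  if (!column %in% names(data)) {
    stop("no column named '", column, "' in `data`", call. = FALSE)
  }
  # `[[` returns the column itself for data frames and for tibbles
  x <- data[[column]]
  if (!is.atomic(x)) {
    stop("column '", column, "' is not an atomic vector (class: ",
         paste(class(x), collapse = "/"), ")", call. = FALSE)
  }
  # as.vector() strips attributes, so a factor comes back as character
  as.vector(x)
}

df <- data.frame(a = 1:5, b = letters[5:1], stringsAsFactors = FALSE)
extract_vector(df, "a")
```

Because is.data.frame() is also TRUE for tibbles, the same helper accepts both, and a list column is rejected at the point of extraction instead of further down in the method.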
Kindly,
Stefan
Stefan McKinnon Høj-Edwards
ph.d. Genetics
+44 (0)776 231 2464
+45 2888 6598
Skype: stefan_edwards
2017-09-26 10:05 GMT+01:00 Joris Meys <Joris.Meys at ugent.be>:
Here's one difference:
library(tibble)
atib <- tibble(a = 1:5, b = letters[5:1])
atib[3, "a"]
as.data.frame(atib)[3, "a"]
The tibble subset returns a tibble (no dropping of dimensions), whereas
the data frame subset returns a plain vector (dimensions dropped). That is
a huge difference if you use [ , aColumn] to select a vector from a data
frame.
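As a hedged illustration of how code can avoid depending on this difference, the sketch below reuses the atib object from the example and adds one assumed name, adf, for its data frame copy. It relies only on `[[` and on the drop argument of base data frame subsetting.

```r
library(tibble)  # assumed to be installed

atib <- tibble(a = 1:5, b = letters[5:1])
adf <- as.data.frame(atib)  # `adf` is a name introduced for this sketch

# `[[` extracts the column itself, so the result is a plain vector either way
v_tib <- atib[["a"]][3]
v_df <- adf[["a"]][3]
identical(v_tib, v_df)

# `drop = FALSE` makes the base data frame keep its dimensions, like the tibble
kept <- adf[3, "a", drop = FALSE]
is.data.frame(kept)
dim(kept)
```

Spelling out either `[[` or `drop = FALSE` states the intent explicitly, so the result no longer depends on which class was passed in.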
Cheers
Joris
On Tue, Sep 26, 2017 at 10:57 AM, Stefan McKinnon Høj-Edwards
<sme at iysik.com> wrote:
Hi Göran,
Could you please elaborate on which kind of subsetting Hadley dislikes?
I have yet to encounter operations on data frames that are not possible on
tibbles.
Kindly,
Stefan McKinnon Høj-Edwards
2017-09-26 8:30 GMT+01:00 Göran Broström <goran.brostrom at umu.se>:
I am beginning to get complaints from users of my CRAN packages
(especially 'eha') to the effect that they get error messages like
"Unsupported use of matrix or array for column indexing".
It turns out that they are passing tibbles into functions that expect
data frames as input. And I am using the kind of subsetting that Hadley
dislikes (eha is an old package, much older than tibbles). It is of
course a simple matter to change the code so it handles both data frames
and tibbles correctly, but this affects many functions, and it will take
time. And when the next guy introduces 'troubles' as an improvement of
'tibbles', I will have to rewrite the code again.
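For what it is worth, one low-effort way to handle both classes is to coerce once at the top of each exported function. The sketch below is an assumption about how that could look, not code from eha: the function column_range and the column names are made up.

```r
library(tibble)  # assumed to be installed

# Hypothetical stand-in for a package function that relies on base
# data frame subsetting.
column_range <- function(data, column) {
  # Tibbles inherit from "data.frame", so this check accepts both
  if (!inherits(data, "data.frame")) {
    stop("`data` must be a data frame", call. = FALSE)
  }
  # Strip tibble (or other subclass) behaviour before any `[` subsetting
  data <- as.data.frame(data)
  x <- data[, column]  # a single column drops to a vector here
  range(x)
}

tb <- tibble(enter = c(0, 1, 2), exit = c(3, 4, 5))
class(tb)  # "tbl_df" "tbl" "data.frame"
column_range(tb, "exit")
```

The coercion costs one line per entry point and leaves the existing subsetting code untouched, at the price of discarding whatever the subclass added.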
While I like Hadley's way of doing it, I think it is a mistake to let a
tibble also be of class data frame. To me it is a matter of
backwards compatibility: a tibble should add nice things to a data
frame, not change basic behaviour, in order to call itself a data frame.
Is it correct to let a tibble be of class "data.frame"?
Göran Broström