
Odd behaviour of mean() with a numeric column in a tibble

{{SIGH}} 

You are absolutely right. 

I wonder if I am losing some cognitive capacities that are needed to be part of the evolving R community. It seems to me that if a tibble is designed to be an enhanced replacement for a dataframe then it shouldn't quite so radically change things. 

I notice that the tibble documentation says "[ Never simplifies (drops), so always returns data.frame".
That is much less explicit than I would have liked, and it doesn't actually seem to be true. As you rightly say, it generally, but not quite always, returns a tibble; it can be fooled into returning a vector of length 1.
Error in `[[.data.frame`(tmpTibble, 1, ) : 
argument "..2" is missing, with no default
# A tibble: 26 × 1
ID 
<chr> 
1 a 
2 b 
3 c 
4 d 
5 e 
6 f 
7 g 
8 h 
9 i 
10 j 
# ... with 16 more rows
# A tibble: 26 × 1
ID 
<chr> 
1 a 
2 b 
3 c 
4 d 
5 e 
6 f 
7 g 
8 h 
9 i 
10 j 
# ... with 16 more rows
Error in `[<-.data.frame`(`*tmp*`, , value = list(ID = c("a", "a", "a", : 
replacement element 3 is a matrix/data frame of 26 rows, need 1 
In addition: Warning messages: 
1: In `[<-.data.frame`(`*tmp*`, , value = list(ID = c("a", "a", "a", : 
replacement element 1 has 26 rows to replace 1 rows 
2: In `[<-.data.frame`(`*tmp*`, , value = list(ID = c("a", "a", "a", : 
replacement element 2 has 26 rows to replace 1 rows
Error: Invalid column indexes: 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26
[1] 1
int 1
Error in col[[i, exact = exact]] : 
attempt to select more than one element in vectorIndex
[1] "b"
So [[a,b]] works if a and b are legal for the dimensions of the tibble and a is of length 1, but it returns NOT a tibble but a vector of length 1 (I think). I can see that's logical, but it's not what the documentation says.

[[a]] and [[,a]] return the same result; that seems excessively tolerant to me.

[[a,b:c]] actually returns [[a,c]] and again as a single value, NOT a tibble. 
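
To make the `[` versus `[[` distinction concrete, here is a minimal sketch. It assumes the tibble package is installed, and `tb` is a hypothetical stand-in for tmpTibble (its columns `x` and `y` are invented for illustration). It only exercises behaviour that I believe is stable across tibble versions: `[` keeps the tibble class, while `[[` pulls out the column itself.

```r
library(tibble)

# Hypothetical stand-in for tmpTibble: 26 rows, three columns.
tb <- tibble(ID = letters, x = 1:26, y = 26:1)
df <- as.data.frame(tb)

# Single-column `[`: a base data frame drops to a plain vector,
# whereas the tibble stays a one-column tibble.
df_col <- df[, 1]
tb_col <- tb[, 1]
is.data.frame(df_col)   # FALSE: a character vector
is_tibble(tb_col)       # TRUE: still a tibble

# `[[` extracts the column itself, so the result is a vector, not a tibble.
tb_vec <- tb[["ID"]]
is.character(tb_vec)    # TRUE

# A single cell can then be taken from that vector in the usual way.
tb_vec[2]
```

So the "never simplifies" sentence in the documentation appears to describe `[` only; `[[` and `$` are extraction operators and return the underlying vector.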

And row subsetting/indexing has gone. 

Why create a replacement for a dataframe that has no row indexing and so radically redefines column indexing, in fact redefines the whole of indexing and subsetting?
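
For comparison, here is a small sketch of row selection on a tibble. It assumes the tibble package is installed, and `tb` is again a hypothetical stand-in rather than the tmpTibble used above. As far as I can tell, the two-argument form `tb[i, ]` still selects rows; it is the single-argument form `tb[i]` that is interpreted as a column index.

```r
library(tibble)

# Hypothetical example data: 26 rows, two columns.
tb <- tibble(ID = letters, x = 1:26)

# Rows by position: the empty second argument means "all columns".
rows <- tb[2:4, ]
is_tibble(rows)   # TRUE
nrow(rows)        # 3

# Rows by logical condition.
one <- tb[tb$ID == "j", ]
one$x             # 10
```

What a tibble does not keep is row names, so selecting rows by name is not available in the way it is for a base data frame.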

OK. I will go to sleep now and hope to feel less dumb(ed) when I wake. Perhaps Prof. Wickham or someone can spell out, less tersely and (I think) more completely than the tibble documentation does, why all this is good.

Thanks anyway Ista, you certainly hit the issue! 

Very best all, 

Chris