tapply bug? - levels of a factor in a data frame after tapply are intermixed

Hello! I have encountered a really weird problem. Maybe you've
encountered it before?
I have a large data frame "importances". It has one factor ($A) with 3
levels: 3, 9, and 15. $B is a regular numeric variable.
Below I am picking a really small sub-frame (just 3 rows) based on
"indices". "indices" were chosen so that all 3 levels of A are
present:

indices=c(14329,14209,14353)
test=data.frame(yy=importances[["B"]][indices],xx=importances[["A"]][indices])

Here is what the new data frame "test" looks like:

            yy        xx
1 -0.009984006  9
2 -2.339904131  3
3 -0.008427385 15

Here is the structure of "test":
'data.frame':   3 obs. of  2 variables:
 $ yy: num  -0.00998 -2.3399 -0.00843
 $ xx: Factor w/ 3 levels "3","9","15": 2 1 3

Notice - the order of factor levels for xx is not 1 2 3 as it should
be but 2 1 3. How come?

Or also look at this:
[1] 9  3  15
Levels: 3 9 15

Same thing.
Do you know what might be the reason?
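
In case it helps, here is a small self-contained sketch that rebuilds "test" by hand, so it can be run without my large data frame. The values are typed in from the printout above, and the explicit levels = c("3", "9", "15") is only my assumption about how A was set up:

```r
# Rebuild "test" by hand; values copied from the printout above.
# Assumption: the factor levels were declared in the order 3, 9, 15.
test <- data.frame(
  yy = c(-0.009984006, -2.339904131, -0.008427385),
  xx = factor(c("9", "3", "15"), levels = c("3", "9", "15"))
)

str(test)              # structure of the rebuilt data frame
levels(test$xx)        # the set of levels, in level order
as.integer(test$xx)    # one integer code per row
as.character(test$xx)  # one label per row
```

On my side this gives the same str() output as above, so the behaviour does not seem to depend on the large data frame or on the subsetting by "indices".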

Thank you very much!