When split's x argument has a class attribute and the
grouping vector, f, is shorter than x, split gives
the wrong result. It appears not to recycle f to the length
of x before doing the split. E.g.,
> split(factor(letters[1:3]), "Group one") # expect all 3 elements in the single group
$`Group one`
[1] a
Levels: a b c
> split(factor(letters[1:3]), c("Group one", "Group two")) # expect warning and Group one should contain "a" and "c".
$`Group one`
[1] a
Levels: a b c
$`Group two`
[1] b
Levels: a b c
We expect the above to act like the similar cases where x is
a character vector:
> split(letters[1:3], "Group one")
$`Group one`
[1] "a" "b" "c"
> split(letters[1:3], c("Group one", "Group two"))
$`Group one`
[1] "a" "c"
$`Group two`
[1] "b"
Warning message:
In split.default(letters[1:3], c("Group one", "Group two")) :
data length is not a multiple of split variable
We get a similar problem for other stray classes of x:
> split(structure(letters[1:3], class="no sUch cLaSs"), c("Group one", "Group two"))
$`Group one`
[1] "a"
$`Group two`
[1] "b"
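Until this is fixed, one possible workaround is to recycle the grouping vector by hand before calling split, so the result does not depend on how split recycles f internally. The following is only a sketch: the names x, f, f_full and res are illustrative, and unlike the character-vector case above it recycles silently, with no warning about the length mismatch.

```r
# Workaround sketch: extend f to length(x) explicitly before splitting.
x <- factor(letters[1:3])
f <- c("Group one", "Group two")

# rep() with length.out recycles f: "Group one" "Group two" "Group one"
f_full <- rep(f, length.out = length(x))
res <- split(x, f_full)

as.character(res[["Group one"]])  # "a" "c"
as.character(res[["Group two"]])  # "b"
```

Because f_full already has the same length as x, it should not matter whether the internal index is built from seq_along(f) or seq_along(x).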
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
split(factor, shortGroupVector) gives incorrect results in R 2.12.2
2 messages · William Dunlap, Peter Dalgaard
On Mar 21, 2011, at 17:16 , William Dunlap wrote:
split(factor(letters[1:3]), c("Group one", "Group two"))
Yes, that's a bug (at the very least, it is against documented behavior)
The strong suspicion is that
ind <- .Internal(split(seq_along(f), f))
should have seq_along(x), not seq_along(f). But would that break for other reasons?
(It would! Surv() objects to name one case. In general, we seem to be in trouble if "[" and length() methods are not compatible.)
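To make the incompatibility concrete, here is a small sketch using a made-up class, toy_surv, which is hypothetical and not taken from the survival package. It assumes a matrix-backed object whose "[" method indexes rows while length() falls back to the default and counts matrix cells. Whether Surv objects behave exactly like this in a given version is an assumption.

```r
# Hypothetical class: a two-column matrix with a row-wise "[" method
# and no length() method of its own.
x <- structure(cbind(time = c(5, 8, 3), status = c(1, 0, 1)),
               class = "toy_surv")

`[.toy_surv` <- function(x, i, ...) {
  # Subset observations (rows) and keep the class.
  structure(unclass(x)[i, , drop = FALSE], class = "toy_surv")
}

f <- c("a", "b", "a")

length(x)                   # 6: default length counts matrix cells
nrow(unclass(x))            # 3: number of observations "[" can index
nrow(unclass(x[c(1, 3)]))   # 2: "[" treats indices as row numbers

seq_along(f)  # 1 2 3: all valid row indices
seq_along(x)  # 1 2 3 4 5 6: indices 4 to 6 are not valid rows
```

For such a class, splitting seq_along(x) would hand row indices beyond 3 to "[", which is the kind of breakage described above.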
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com