
split() - unexpected sorting of results

Hi,
On 10/20/2017 12:53 PM, Peter Meissner wrote:
Maybe a little surprising, but no more than:

 > x <- sample(11L)

 > sort(x)
  [1]  1  2  3  4  5  6  7  8  9 10 11

 > sort(as.character(x))
  [1] "1"  "10" "11" "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"

The fact that sort(), as.factor(), split() and many other things behave
consistently with respect to the underlying order of character vectors
avoids other even bigger surprises.
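
To make that consistency concrete, here is a minimal sketch (the keys and
values are made-up data, not taken from the original report). It checks
that the group names returned by split() are the factor levels of the
keys, which in turn are the sorted unique keys:

```r
# Hypothetical data: character keys that happen to look like numbers.
keys <- c("10", "2", "1", "11", "2")
vals <- seq_along(keys)

# split() coerces a non-factor grouping vector with as.factor().
groups <- split(vals, keys)

# The group names are the factor levels, i.e. the sorted unique keys.
same_as_levels <- identical(names(groups), levels(as.factor(keys)))
same_as_sort   <- identical(names(groups), sort(unique(keys)))

print(names(groups))
print(c(same_as_levels, same_as_sort))
```

Both checks should come out TRUE whatever the locale, since all three
calls rely on the same character ordering.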

Also note that the sort order of character vectors depends on the
collation rules of your locale. One way to guarantee consistent results
across platforms/locales is to specify the levels explicitly when making
the factor, e.g.

   f <- factor(x, levels=unique(x))
   split(1:11, f)

This is particularly sensible when writing unit tests.
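
As a rough illustration of the unit-test case (again with made-up keys),
the sketch below fixes the levels to first-appearance order, so the
expected group order can be hard-coded without depending on the locale.
The last lines show sort(method = "radix"), which always uses C-locale
ordering for character vectors, as another locale-independent option:

```r
# Hypothetical keys mixing upper and lower case, where locales disagree.
keys <- c("b", "a", "B", "a")
vals <- seq_along(keys)

# Explicit levels: groups come out in order of first appearance.
f <- factor(keys, levels = unique(keys))
groups <- split(vals, f)

# A locale-independent assertion, suitable for a unit test.
stopifnot(identical(names(groups), c("b", "a", "B")))

# Alternative: C-locale ordering, independent of the session locale.
c_order <- sort(unique(keys), method = "radix")
print(c_order)
```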

Cheers,
H.