Ordering long vectors
On Sat, 7 Jun 2003, Göran Broström wrote:
I need to order a long vector of integers with rather few unique values. This is very slow:
x <- sample(rep(c(1:10), 50000))
system.time(ord <- order(x))
[1] 189.18 0.09 190.48 0.00 0.00
But with no ties
y <- sample(500000)
system.time(ord1 <- order(y))
[1] 1.18 0.00 1.18 0.00 0.00
it is very fast! This gave me the following idea: since I don't care about keeping the order within tied values, why not add some small disturbance to x? And indeed,
system.time(ord2 <- order(x + runif(length(x), -0.1, 0.1)))
[1] 1.66 0.00 1.66 0.00 0.00
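As a small check of the disturbance idea, the sketch below confirms that adding noise in (-0.1, 0.1) to integers still gives a valid sort order: the noise is smaller than half the gap between distinct values, so only the order within ties changes. The short vector x_small and the seed are assumptions for illustration, not the 500,000-element x from the timings.

```r
# Illustrative small vector of integers with many ties (assumed example data).
set.seed(1)
x_small <- sample(rep(1:3, 4))

# Jitter smaller than half the gap between distinct integers, so values
# from different groups can never swap places.
ord_jit <- order(x_small + runif(length(x_small), -0.1, 0.1))

# The jittered ordering sorts x_small; only the order within ties is arbitrary.
sorted_ok <- identical(x_small[ord_jit], sort(x_small))
print(sorted_ok)
```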
An even better way is
system.time(ord3 <- order(x + seq(0, 0.9, length = length(x))))
[1] 1.32 0.05 1.37 0.00 0.00
Faster and, more importantly, it keeps the original ordering for tied values. Thanks to James Holtman.
Göran
[...]
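To illustrate why the seq() offset keeps tied values in their original order, here is a minimal sketch on a short vector. The offsets increase with position and stay below 1, so they never move an element past a larger integer, and within a tie the earlier element gets the smaller key. The vector x_small and the reference ordering ord_ref are assumptions added for the comparison; they are not part of the original post.

```r
# Illustrative small vector with ties (assumed example data).
x_small <- c(3L, 1L, 2L, 1L, 3L, 2L, 1L)
n <- length(x_small)

# Strictly increasing offsets in [0, 0.9]: ties are broken by position.
ord_seq <- order(x_small + seq(0, 0.9, length.out = n))

# Reference: order by value, then explicitly by original index.
ord_ref <- order(x_small, seq_along(x_small))

print(ord_seq)
print(identical(ord_seq, ord_ref))
```

The trick relies on x holding integers: the offsets must stay smaller than the smallest gap between distinct values.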