Ordering long vectors
On Mon, 9 Jun 2003, Prof Brian Ripley wrote:
On Sun, 8 Jun 2003, G?ran Brostr?m wrote:
On Sun, 8 Jun 2003, Thomas Lumley wrote:
On Sat, 7 Jun 2003, [ISO-8859-1] G?ran Brostr?m wrote:
I need to order a long vector of integers with rather few unique values. This is very slow:
I think the culprit is
src/main/sort.c: orderVector1
/* Shell sort isn't stable, but it proves to be somewhat faster
to run a final insertion sort to re-order runs of ties when
comparison is cheap.
*/
This also explains:
aa<-sample(rep(1:10,50000)) system.time( order(aa, 1:length(aa)))
[1] 3.67 0.01 3.68 0.00 0.00
system.time( order(aa))
^C Timing stopped at: 49.33 0.01 49.34 0 0 which is perhaps the simplest work-around :).
There is a simple and very much faster solution if you don't care about ordering of ties: system.time(ord4 <- sort(x, method="quick", index.return = TRUE)$ix)
Thanks, it is indeed much faster; about 7 times on my example.
That is in the help, and I am very surprised you failed to find it.
Well, I found it, but I was cooled off by the "somewhat faster than Shellsort" and "poor performance in the worst case", together with the fact that I only needed the order and not the sorted vector. Maybe a more positive description is in order :-). Especially on the help page of 'order'. G.