
Sorting problem

13 messages · Jun Shen, Bill Venables, Stavros Macrakis +3 more

#
Perhaps BA[, 2] is a factor?  What you might need is something like

BA[order(BA[, 1], -as.numeric(BA[, 2])), ]

?

Bill Venables.
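
A minimal sketch of that suggestion, assuming the second column is a factor. The data frame below is made up for illustration (the original BA was not posted), and the column names are hypothetical.

```r
# Hypothetical stand-in for the poster's BA: numeric first column,
# factor second column.
BA <- data.frame(
  id  = c(1, 1, 2, 2),
  grp = factor(c("a", "b", "a", "b"))
)

# Unary minus is not meaningful for a factor, so negate its integer
# codes instead: ascending on column 1, descending on column 2.
sorted <- BA[order(BA[, 1], -as.numeric(BA[, 2])), ]
print(sorted)
```

Within each value of the first column, the rows now appear in reverse level order of the factor.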
#
Bill.Venables at csiro.au wrote:
More generally, the xtfrm() function converts a vector into a numeric 
one that sorts in the same order.  It will work on character columns as 
well:

BA[order(BA[, 1], -xtfrm(BA[, 2])), ]

Duncan Murdoch
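
As an illustration of the character-column case, here is a small sketch on made-up data (the data frame and its column names are assumptions, not the poster's BA).

```r
# Hypothetical data: the second column is character, not a factor.
BA <- data.frame(
  id   = c(1, 1, 2, 2),
  name = c("x", "y", "x", "y"),
  stringsAsFactors = FALSE
)

# xtfrm() maps the strings to numbers that sort the same way, so the
# result can be negated to get a descending secondary key.
sorted <- BA[order(BA[, 1], -xtfrm(BA[, 2])), ]
print(sorted)
```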
#
On Sat, Mar 28, 2009 at 7:53 AM, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
Thanks, I learn a lot just by reading the answers to other people's
questions on this list.

Some followup questions:

1) Where does the name 'xtfrm' come from?

2) Why isn't xtfrm of a numeric vector the identity function?

3) Similarly, why isn't xtfrm of a logical vector just as.integer of
it? (as it is for factors)

4) Why is xtfrm(c(2,3,2)) => c(1,3,1) and not c(1,2,1) or for that
matter c(2,3,2) (see point 2)?

Thanks,

          -s
#
Stavros Macrakis wrote:
and one more;  ?xtfrm says: 

"     The default method will make use of '==', '>'  and 'is.na' methods
     for the class of 'x', but might be rather slow when doing so.
"

and yet:

    x = c(1i, 0i)
    is(x)
    # [1] "complex" "vector"

    xtfrm(x)
    # [1] 2 1

there is no xtfrm.complex, so the default should be called and it would
make use of '==', '>', and 'is.na' for the class 'complex'.  but this fails:

    0i > 1i
    # error: invalid comparison with complex values

how come?  (yes, it's again the flaw of complex being totally ordered
while '>' and relatives cannot be used to establish the order).

vQ
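
The mismatch described above can be reproduced with a short sketch: ordering a complex vector works, while comparing two complex values with '>' signals an error. tryCatch() is used here only so that the script keeps running.

```r
x <- c(1i, 0i)

# order() and sort() accept complex vectors.
ord <- order(x)
srt <- sort(x)

# The comparison operator does not; capture the error message
# instead of stopping.
cmp <- tryCatch(0i > 1i, error = function(e) conditionMessage(e))

print(ord)
print(cmp)
```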
#
On 28/03/2009 4:57 PM, Stavros Macrakis wrote:
I don't know.
"Because the implementation does that".  It's a one liner:

xtfrm.default <- function (x)
  as.vector(rank(x, ties.method = "min", na.last = "keep"))

The identity function would probably be faster, but why optimize cases 
that are never needed?  A better optimization would be just to use 
order(x), not order(xtfrm(x)).

Duncan Murdoch
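
To see what that one-liner produces, here is a sketch that copies it under a hypothetical name (rank_xtfrm), so that base R's own xtfrm.default is not masked. Other R versions may treat numeric input differently in xtfrm() itself, which is why rank() is called directly.

```r
# Copy of the one-liner quoted above, under a made-up name.
rank_xtfrm <- function(x)
  as.vector(rank(x, ties.method = "min", na.last = "keep"))

# Ties share the smallest rank of their group, hence 1 3 1.
r_num <- rank_xtfrm(c(2, 3, 2))

# na.last = "keep" leaves missing values as NA in the result.
r_chr <- rank_xtfrm(c("b", "a", NA))

print(r_num)
print(r_chr)
```

With ties.method = "min", the two 2s both get rank 1 and the 3 gets rank 3, which accounts for the c(1,3,1) result asked about in question 4.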
#
On Sun, Mar 29, 2009 at 6:15 AM, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
Hmm.  If the origins of the name are lost in the mists of history,
perhaps a more mnemonic/intuitive one could be found?
Yes, I was asking *why* the implementation does that.
Why should the *calling* function have the burden of type dispatching?

          -s
#
On 29/03/2009 2:14 PM, Stavros Macrakis wrote:
I'm flattered, but I feel obliged to point out that just because I don't 
know something doesn't mean the information does not exist.

Duncan Murdoch
#
Duncan Murdoch-2 wrote:
==========
r46470 | ripley | 2008-08-31 12:53:42 -0400 (Sun, 31 Aug 2008) | 3 lines

add xtfrm, use it in order() etc
make is.unsorted work on classed objects
==========

 Adding

xtfrm.numeric <- function(x) {x}

would seem to add the case that people are looking for -- unless
there's something special that needs to be handled with NAs ???

  Ben Bolker
#
On Sun, Mar 29, 2009 at 5:21 PM, Ben Bolker <bolker at ufl.edu> wrote:
Yes, that was what I was suggesting.  xtfrm currently converts NaN to
NA, but that does not seem to be necessary for it to meet its
specification of "producing a numeric vector which will sort in the
same order as 'x'", since sort treats NaNs the same as NAs.

        -s
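
A small sketch of the NaN point, using rank() with the same arguments as the one-liner quoted earlier in the thread. The input vector is made up for illustration.

```r
x <- c(2, NaN, 1, NA)

# Both NaN and NA come back as plain NA from the rank-based transform.
r <- rank(x, ties.method = "min", na.last = "keep")
print(r)

# order() puts both kinds of missing value after the real numbers,
# and sort() drops both by default.
o <- order(x)
s <- sort(x)
print(o)
print(s)
```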
#
Here's a clue:
[1] 1 3 4 7 2 5 6
[1] 1 3 4 7 2 5 6
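
The commands that produced these two lines did not survive in the archive. One pair that gives exactly this output for the vector a from the quoted question is rank(a) and order(order(a)); that pairing is an assumption, not necessarily what was originally typed.

```r
a <- c("2a", "2c", "3", "5", "2b", "4a", "4b")

# rank(a): the position each element would take in the sorted vector.
r <- rank(a)

# Applying order() twice gives the same numbers when there are no ties.
oo <- order(order(a))

print(r)
print(oo)
```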

W. 


Bill Venables
http://www.cmis.csiro.au/bill.venables/ 


-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Jun Shen
Sent: Monday, 30 March 2009 10:14 AM
To: Stavros Macrakis
Cc: r-help at r-project.org; Ben Bolker
Subject: Re: [R] Sorting problem

That's leading to another question: How does rank() work?

If I have a character vector
 a<- c("2a", "2c", "3",  "5" , "2b" ,"4a", "4b")
Then a[order(a)] returns
"2a" "2b" "2c" "3"  "4a" "4b" "5", which makes sense
But a[rank(a)] returns
 "2a" "3"  "5"  "4b" "2c" "2b" "4a", which does not seem to make sense.

Similarly, for a numeric vector
b<-c( 2,  3,  1 , 6 , 5, 10,  4 , 7 , 9 , 8)
b[order(b)] returns  1  2  3  4  5  6  7  8  9 10
b[rank(b)] returns  3  1  2 10  5  8  6  4  9  7

Any explanation? Thanks.
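
A sketch of the difference, using the numeric vector b from above: order(b) says which elements to pick to get a sorted vector, while rank(b) says where each element lands after sorting. So b[order(b)] is sorted, and it is the sorted vector, not b, that rank(b) indexes back into the original.

```r
b <- c(2, 3, 1, 6, 5, 10, 4, 7, 9, 8)

o <- order(b)  # indices that sort b
r <- rank(b)   # position of each element in the sorted vector

sorted_b  <- b[o]         # same as sort(b)
recovered <- sort(b)[r]   # gives back b itself

# Indexing b by its own ranks just permutes it, as reported above.
shuffled <- b[r]

print(sorted_b)
print(recovered)
print(shuffled)
```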
On Sun, Mar 29, 2009 at 5:42 PM, Stavros Macrakis <macrakis at alum.mit.edu> wrote: