Sorting problem
13 messages · Jun Shen, Bill Venables, Stavros Macrakis +3 more
Perhaps BA[, 2] is a factor? What you might need is something like
BA[order(BA[, 1], -as.numeric(BA[, 2])), ]
Bill Venables.
From: r-help-bounces at r-project.org [r-help-bounces at r-project.org] On Behalf Of Jun Shen [jun.shen.ut at gmail.com]
Sent: 28 March 2009 08:26
To: r-help at stat.math.ethz.ch
Subject: [R] Sorting problem
Hi, everyone,
I was trying to sort a data frame by two columns, one increasing, the other
decreasing and got an error.
"Error in FUN(left) : invalid argument to unary operator",
The command is "BA[order(BA[1],-BA[2]),]". BA is the data frame. It was
working if I used increasing on both columns.
Why the decreasing symbol "-" is not working here? Thanks.
--
Jun Shen PhD
PK/PD Scientist
BioPharma Services
Millipore Corporation
15 Research Park Dr.
St Charles, MO 63304
Direct: 636-720-1589
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Bill.Venables at csiro.au wrote:
Perhaps BA[, 2] is a factor? What you might need is something like BA[order(BA[, 1], -as.numeric(BA[, 2])), ]
More generally, the xtfrm() function converts a vector into a numeric one that sorts in the same order. It will work on character columns as well:
BA[order(BA[, 1], -xtfrm(BA[, 2])), ]
Duncan Murdoch
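To make the two suggestions above concrete, here is a minimal sketch. The data frame BA below is invented for illustration (the original BA was never posted), and it assumes the second column is character rather than numeric; the column names are placeholders.

```r
# Hypothetical stand-in for BA; the real data were not shown in the thread.
BA <- data.frame(
  id    = c(1, 2, 1, 2),
  label = c("a", "b", "c", "a"),
  stringsAsFactors = FALSE
)

# Unary minus is not defined for character vectors, so this call is an error.
minus_fails <- inherits(try(-BA[, 2], silent = TRUE), "try-error")

# xtfrm() returns numbers that sort like the labels; negating them
# reverses the order of the second key only.
sorted <- BA[order(BA[, 1], -xtfrm(BA[, 2])), ]
print(sorted)
```

If the column were a factor instead, -as.numeric(BA[, 2]) would negate the integer level codes, so the descending order would follow the order of the factor levels.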
On Sat, Mar 28, 2009 at 7:53 AM, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
...More generally, the xtfrm() function converts a vector into a numeric one that sorts in the same order. ...
Thanks, I learn a lot just by reading the answers to other people's
questions on this list.
Some followup questions:
1) Where does the name 'xtfrm' come from?
2) Why isn't xtfrm of a numeric vector the identity function?
3) Similarly, why isn't xtfrm of a logical vector just as.integer of
it? (as it is for factors)
4) Why is xtfrm(c(2,3,2)) => c(1,3,1) and not c(1,2,1) or for that
matter c(2,3,2) (see point 2)?
Thanks,
-s
Stavros Macrakis wrote:
On Sat, Mar 28, 2009 at 7:53 AM, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
...More generally, the xtfrm() function converts a vector into a numeric one that sorts in the same order. ...
Thanks, I learn a lot just by reading the answers to other people's questions on this list. Some followup questions:
1) Where does the name 'xtfrm' come from?
2) Why isn't xtfrm of a numeric vector the identity function?
3) Similarly, why isn't xtfrm of a logical vector just as.integer of it? (as it is for factors)
4) Why is xtfrm(c(2,3,2)) => c(1,3,1) and not c(1,2,1) or for that matter c(2,3,2) (see point 2)?
and one more; ?xtfrm says:
" The default method will make use of '==', '>' and 'is.na' methods
for the class of 'x', but might be rather slow when doing so.
"
and yet:
x = c(1i, 0i)
is(x)
# [1] "complex" "vector"
xtfrm(x)
# [1] 2 1
there is no xtfrm.complex, so the default should be called and it would
make use of '==', '>', and 'is.na' for the class 'complex'. but this fails:
0i > 1i
# error: invalid comparison with complex values
how come? (yes, it's again the flaw of complex being totally ordered
while '>' and relatives cannot be used to establish the order).
vQ
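The observation above can be reproduced with a short sketch. It relies only on base R: order() and rank() accept complex vectors, while the comparison operators refuse them. Calling rank() directly here, with the arguments from the one-line xtfrm.default quoted elsewhere in this thread, is an assumption about what the default method did at the time.

```r
x <- c(1i, 0i)

# order() and rank() handle complex values (real part first, then imaginary).
ord <- order(x)
rk  <- rank(x, ties.method = "min", na.last = "keep")

# The '>' operator, by contrast, signals an error for complex values.
cmp <- tryCatch(0i > 1i, error = function(e) conditionMessage(e))

print(ord)
print(rk)
print(cmp)
```

This suggests the ranking is done internally rather than through '>', which would explain why the result is available even though the comparison is not.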
On 28/03/2009 4:57 PM, Stavros Macrakis wrote:
On Sat, Mar 28, 2009 at 7:53 AM, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
...More generally, the xtfrm() function converts a vector into a numeric one that sorts in the same order. ...
Thanks, I learn a lot just by reading the answers to other people's questions on this list. Some followup questions: 1) Where does the name 'xtfrm' come from?
I don't know.
2) Why isn't xtfrm of a numeric vector the identity function?
"Because the implementation does that". It's a one liner:
xtfrm.default <- function (x) as.vector(rank(x, ties.method = "min", na.last = "keep"))
The identity function would probably be faster, but why optimize cases that are never needed? A better optimization would be just to use order(x), not order(xtfrm(x)).
Duncan Murdoch
3) Similarly, why isn't xtfrm of a logical vector just as.integer of it? (as it is for factors)
4) Why is xtfrm(c(2,3,2)) => c(1,3,1) and not c(1,2,1) or for that matter c(2,3,2) (see point 2)?
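Question 4 appears to follow from the ties.method = "min" argument in the one-liner quoted above: tied values all receive the smallest rank in their group, and the next distinct value skips ahead. A small sketch using rank() directly (the dense variant via match() is only shown for comparison and is not something the thread proposes):

```r
x <- c(2, 3, 2)

r_min   <- rank(x, ties.method = "min")   # ties share the lowest rank: 1 3 1
r_dense <- match(x, sort(unique(x)))      # "dense" ranks, for comparison: 1 2 1

# All three keys induce the same ordering, which is all order() needs.
same_order <- identical(order(x), order(r_min)) &&
  identical(order(x), order(r_dense))

print(r_min)
print(r_dense)
print(same_order)
```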
On Sun, Mar 29, 2009 at 6:15 AM, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
On 28/03/2009 4:57 PM, Stavros Macrakis wrote:
On Sat, Mar 28, 2009 at 7:53 AM, Duncan Murdoch <murdoch at stats.uwo.ca> wrote: 1) Where does the name 'xtfrm' come from?
I don't know.
Hmm. If the origins of the name are lost in the mists of history, perhaps a more mnemonic/intuitive one could be found?
2) Why isn't xtfrm of a numeric vector the identity function?
"Because the implementation does that". It's a one liner:...
Yes, I was asking *why* the implementation does that.
The identity function would probably be faster, but why optimize cases that are never needed? A better optimization would be just to use order(x), not order(xtfrm(x)).
Why should the *calling* function have the burden of type dispatching?
-s
On 29/03/2009 2:14 PM, Stavros Macrakis wrote:
On Sun, Mar 29, 2009 at 6:15 AM, Duncan Murdoch <murdoch at stats.uwo.ca> wrote:
On 28/03/2009 4:57 PM, Stavros Macrakis wrote:
On Sat, Mar 28, 2009 at 7:53 AM, Duncan Murdoch <murdoch at stats.uwo.ca> wrote: 1) Where does the name 'xtfrm' come from?
I don't know.
Hmm. If the origins of the name are lost in the mists of history,
I'm flattered, but I feel obliged to point out that just because I don't know something doesn't mean the information does not exist.
Duncan Murdoch
perhaps a more mnemonic/intuitive one could be found?
Duncan Murdoch-2 wrote:
I'm flattered, but I feel obliged to point out that just because I don't know something doesn't mean the information does not exist.
Duncan Murdoch
From the SVN log:
==========
r46470 | ripley | 2008-08-31 12:53:42 -0400 (Sun, 31 Aug 2008) | 3 lines
add xtfrm, use it in order() etc
make is.unsorted work on classed objects
==========
Adding
xtfrm.numeric <- function(x) {x}
would seem to add the case that people are looking for -- unless
there's something special that needs to be handled with NAs ???
Ben Bolker
On Sun, Mar 29, 2009 at 5:21 PM, Ben Bolker <bolker at ufl.edu> wrote:
Adding
xtfrm.numeric <- function(x) {x}
would seem to add the case that people are looking for -- unless
there's something special that needs to be handled with NAs ???
Yes, that was what I was suggesting. xtfrm currently converts NaN to
NA, but that does not seem to be necessary for it to meet its
specification of "producing a numeric vector which will sort in the
same order as 'x'", since sort treats NaNs the same as NAs.
-s
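A quick check of the claim above, as a sketch only: the rank-based key (the one-liner quoted earlier in this thread) turns NaN into NA, whereas an identity key leaves it alone, yet both give the same ordering. The variable names are illustrative, and no xtfrm.numeric method is actually defined here.

```r
x <- c(3, NaN, 1, 2, 1)

# Key produced by the rank-based default quoted in this thread.
key_rank <- rank(x, ties.method = "min", na.last = "keep")

# Key the proposed identity method would produce.
key_identity <- x

# The rank-based key has NA where x had NaN.
nan_became_na <- is.na(key_rank[2]) && !is.nan(key_rank[2])

print(order(key_rank))
print(order(key_identity))
```

Both calls put the missing element last and leave the tied 1s in their original relative order.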
Here's a clue:
rank(a)
[1] 1 3 4 7 2 5 6
order(order(a))
[1] 1 3 4 7 2 5 6
W. Bill Venables
http://www.cmis.csiro.au/bill.venables/
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Jun Shen
Sent: Monday, 30 March 2009 10:14 AM
To: Stavros Macrakis
Cc: r-help at r-project.org; Ben Bolker
Subject: Re: [R] Sorting problem
That's leading to another question: How does rank() work? If I have a character vector
a<- c("2a", "2c", "3", "5" , "2b" ,"4a", "4b")
Then a[order(a)] returns "2a" "2b" "2c" "3" "4a" "4b" "5", which makes sense.
But a[rank(a)] returns "2a" "3" "5" "4b" "2c" "2b" "4a", which does not seem to make sense.
Similarly, for a numeric vector
b<-c( 2, 3, 1 , 6 , 5, 10, 4 , 7 , 9 , 8)
b[order(b)] returns 1 2 3 4 5 6 7 8 9 10
b[rank(b)] returns 3 1 2 10 5 8 6 4 9 7
Any explanation? Thanks.
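One way to read the clue, sketched with the numeric vector from the question: order(b) lists which positions to pick to get sorted output, while rank(b) says where each element lands in the sorted output. When there are no ties the two are inverse permutations of each other, so indexing with rank() does not sort.

```r
b <- c(2, 3, 1, 6, 5, 10, 4, 7, 9, 8)

o <- order(b)  # o[i]: position in b of the i-th smallest value
r <- rank(b)   # r[i]: position of b[i] within the sorted vector

sorted_by_order <- b[o]        # sorted, as expected
shuffled_by_rank <- b[r]       # not sorted: r is the wrong kind of index
restored <- sort(b)[r]         # rank() maps the sorted vector back to b

print(sorted_by_order)
print(shuffled_by_rank)
print(all(r == order(o)))      # the identity from the clue
```

The last line is the same relationship as rank(a) and order(order(a)) in the reply, and it holds here because b has no ties.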