vector labels are not permuted properly in a call to sort() (R 2.1)
The main problem is that R is inconsistent here. There are lots of branches
through the sort() code. Greg showed one. Here are four more:

> sort(y, method="quick")
  [,1] [,2]
A    1    5
B    2    6
C    3    7
D    4    8
> names(y) <- letters[1:8]
> sort(y)
h g f e d c b a
1 2 3 4 5 6 7 8
> sort(y, method="quick")
  [,1] [,2]
A    1    5
B    2    6
C    3    7
D    4    8
attr(,"names")
[1] "h" "g" "f" "e" "d" "c" "b" "a"
> sort(y, partial=4)
  [,1] [,2]
A    1    5
B    2    6
C    3    7
D    4    8
attr(,"names")
[1] "a" "b" "c" "d" "e" "f" "g" "h"

I believe Svr4 does keep names but does not allow names on matrices.

There are other problems: should sorting a time-series preserve the ts
properties (probably not, but it does)? Should (S3 or S4) class
information be preserved (it seems inappropriate for a time series, for
example)?

The course of least resistance here is to always preserve attributes and
to document that we do so. Probably the most S-compliant solution is to
preserve only names (and sort them as now). David James quotes the Blue
Book, but note that S itself no longer follows the principle stated there.
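Until the attribute handling is settled, one way to avoid depending on which branch of sort() is taken is to index by order(), so that each label is permuted together with its value. The following is only a minimal sketch in base R; the objects y, m, y_sorted and m_sorted are illustrative names, not taken from the examples above.

```r
# Sketch: keep each label attached to its value by indexing with order().
# All object names here are illustrative assumptions.

# Named vector: subsetting by order() permutes names along with values.
y <- c(A = 8, B = 7, C = 6, D = 5)
y_sorted <- y[order(y)]
print(y_sorted)

# One-column matrix with row names: reorder whole rows, keeping the
# matrix shape with drop = FALSE so the row names stay with their rows.
m <- as.matrix(y)
m_sorted <- m[order(m[, 1]), , drop = FALSE]
print(m_sorted)
```

Because the permutation is applied through `[`, the result does not rely on sort() copying or sorting attributes.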
On Wed, 5 Oct 2005, Martin Maechler wrote:
"AndyL" == Liaw, Andy <andy_liaw at merck.com>
on Tue, 4 Oct 2005 13:51:11 -0400 writes:
AndyL> The `problem' is that sort() does not do anything special when given
AndyL> a matrix: it only treats it as a vector. After sorting, it copies
AndyL> attributes of the original input to the output. Since dimnames are
AndyL> attributes, they get copied as is.
exactly. Thanks Andy.
And I think users would want this (copying of attributes) in
many cases; in particular for user-created attributes
?sort really talks about sorting of vectors and factors;
and it doesn't mention attributes explicitly at all
{which should probably be improved}.
One could wonder if R should keep the dim & dimnames
attributes for arrays and matrices.
S-plus (6.2) simply drops them {returning a bare unnamed vector}
and that seems pretty reasonable to me.
At least the user would never make the wrong assumptions that
Greg made about ``matrix sorting''.
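A user who wants the bare-vector result regardless of what sort() does with attributes can strip them before sorting. This is just a sketch, assuming base R's as.vector(), which drops dim, dimnames and names from an atomic matrix; the name bare is illustrative.

```r
# Sketch: drop dim/dimnames/names first, then sort the plain vector.
y <- matrix(8:1, 4, 2, dimnames = list(LETTERS[1:4], NULL))

bare <- sort(as.vector(y))   # as.vector() removes all attributes
print(bare)
print(attributes(bare))      # NULL: no labels left to be misread
```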
AndyL> Try:
>> y <- matrix(8:1, 4, 2, dimnames=list(LETTERS[1:4], NULL))
>> y
AndyL>   [,1] [,2]
AndyL> A    8    4
AndyL> B    7    3
AndyL> C    6    2
AndyL> D    5    1
>> sort(y)
AndyL>   [,1] [,2]
AndyL> A    1    5
AndyL> B    2    6
AndyL> C    3    7
AndyL> D    4    8
AndyL> Notice the row names stay the same. I'd argue that this is the correct
AndyL> behavior.
AndyL> Andy
>> From: Greg Finak
>>
>> Not sure if this is the correct forum for this,
yes, R-devel is the proper forum.
{also since this is really a proposal for a change in R ...}
>> but I've found what I
>> would consider to be a potentially serious bug to the
>> unsuspecting user.
>> Given a numeric vector V with class labels in R, the following calls
>>
>> 1.
>> > sort(as.matrix(V))
>>
>> and
>>
>> 2.
>> > as.matrix(sort(V))
>>
>> produce different output. The vector is sorted properly in
>> both cases,
>> but only 2. produces the correct labeling of the vector. The call to
>> 1. produces a vector with incorrect labels (not sorted).
>>
>> Code:
>> > X<-c("A","B","C","D","E","F","G","H")
>> > Y<-rev(1:8)
>> > names(Y)<-X
>> > Y
>> A B C D E F G H
>> 8 7 6 5 4 3 2 1
>> > sort(as.matrix(Y))
>>   [,1]
>> A    1
>> B    2
>> C    3
>> D    4
>> E    5
>> F    6
>> G    7
>> H    8
>> > as.matrix(sort(Y))
>>   [,1]
>> H    1
>> G    2
>> F    3
>> E    4
>> D    5
>> C    6
>> B    7
>> A    8
>>
______________________________________________
R-devel at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595