Proposal unary - operator for factors

10 messages · Hadley Wickham, William Dunlap, Duncan Murdoch +1 more

#
Hi all,

Why not make the unary minus operator return the factor with levels
reversed?  This would make it much easier to sort factors in
descending order in part of an order statement.

Hadley
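
For context, a minimal sketch of the status quo and of what the proposal would amount to. In current R, arithmetic on a factor (unary minus included) returns NA with a warning. The helper `rev_levels` below is hypothetical, not part of base R, and only illustrates the effect of reversing the levels; the data are made up.

```r
size <- factor(c("Small", "Large", "Medium", "Small"),
               levels = c("Small", "Medium", "Large"))

# Unary minus on a factor currently yields NA (plus a warning).
neg <- suppressWarnings(-size)

# Hypothetical helper: same values, level order reversed.
rev_levels <- function(f) factor(f, levels = rev(levels(f)))

asc  <- order(size)              # ascending by level
desc <- order(rev_levels(size))  # descending by original level

as.character(size[asc])
as.character(size[desc])
```

Under the proposal, `-size` would behave roughly like `rev_levels(size)` when used as a sort key.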
#
It wouldn't make sense in the context of
   vector[-factor]

Wouldn't it be better to allow order's decreasing argument
to be a vector with one element per ... argument?  That
would work for numbers, factors, dates, and anything
else.  Currently order silently ignores decreasing[2] and
beyond.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
#
True, but that doesn't work currently so you wouldn't lose anything.
However, it would make a certain class of problem that used to throw
errors become silent.
The problem is you might want to do something like order(a, -b, c, -d)

Hadley
#
Currently, for numeric a you can do either
   order(-a)
or
   order(a, decreasing=TRUE)
For nonnumeric types like POSIXct and factors only
the latter works.

Under my proposal your
   order(a, -b, c, -d)
would be
   order(a, b, c, d, decreasing=c(FALSE,TRUE,FALSE,TRUE))
and it would work for any orderable class without modifications
to any classes.

Bill
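
A note for readers on a recent R: as far as the documentation for `order` indicates, R 3.3.0 and later accept one `decreasing` value per key, but only with `method = "radix"`. The sketch below uses made-up vectors and assumes such a version; it is not what was available when this thread was written.

```r
a <- c(1, 1, 2, 2)
b <- c(1, 2, 1, 2)
f <- factor(c("lo", "hi", "lo", "hi"), levels = c("lo", "hi"))

# a ascending, b descending: the numeric-only negation trick ...
by_negation <- order(a, -b)

# ... and the per-key form, which needs method = "radix".
by_radix <- order(a, b, decreasing = c(FALSE, TRUE), method = "radix")

# The per-key form also handles a factor, where -f is not meaningful.
by_factor <- order(a, f, decreasing = c(FALSE, TRUE), method = "radix")

identical(by_negation, by_radix)
```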
#
On 03/02/2010 6:49 PM, William Dunlap wrote:
Why not use

  order(a, -xtfrm(b), c, -xtfrm(d))

??

Duncan Murdoch
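
To illustrate the suggestion: `xtfrm` is the base R generic that returns a numeric vector sorting in the same order as its argument, so negating its result gives a descending key for classes where unary minus is not defined. A small sketch on made-up values:

```r
d <- as.Date(c("2010-02-03", "2009-01-01", "2010-02-28"))
f <- factor(c("Medium", "Small", "Large"),
            levels = c("Small", "Medium", "Large"))

key_d <- xtfrm(d)  # plain numbers (days since 1970-01-01)
key_f <- xtfrm(f)  # the integer level codes

# Descending order for either class by negating the numeric key.
desc_d <- order(-key_d)
desc_f <- order(-key_f)

d[desc_d]
f[desc_f]
```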
#
That's a good suggestion.  You could make it even easier to read with
desc <- function(x) -xtfrm(x)

order(a, desc(b), c, desc(d))

Could you remind me what xtfrm stands for?

Thanks!

Hadley
#
You could, if you can remember it.  I have been annoyed
that decreasing= was in order() but not as useful as it
could be since it is not vectorized.  The same goes for
na.last, although that seems less useful to me.

Here is a version of order (based on the
algorithm used in S+'s order) that
vectorizes the na.last and decreasing
arguments.  It calls the existing order
function to implement decreasing=TRUE/FALSE
and na.last=TRUE/FALSE for a single argument
but order itself could be modified in this
way.

new.order <- function (..., na.last = TRUE, decreasing = FALSE)
{
    vectors <- list(...)
    nVectors <- length(vectors)
    stopifnot(nVectors > 0)
    # Recycle both options to one value per sort key.
    na.last <- rep(na.last, length.out = nVectors)
    decreasing <- rep(decreasing, length.out = nVectors)
    keys <- seq_len(length(vectors[[1]]))
    # Sort by the last key first; each later pass keeps ties in place.
    for (i in nVectors:1) {
        v <- vectors[[i]]
        if (length(v) < length(keys))
            v <- rep(v, length.out = length(keys))
        keys <- keys[order(v[keys], na.last = na.last[i],
                           decreasing = decreasing[i])]
    }
    keys
}
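
The loop above relies on a standard trick: sort by the least significant key first, then re-sort by each more significant key, counting on order() to leave ties in their existing order. A small illustration with two made-up vectors, `g` and `v` (not from the thread), written out pass by pass:

```r
g <- c(2, 1, 2, 1)   # primary key, ascending
v <- c(1, 2, 3, 4)   # secondary key, descending

keys <- seq_along(g)

# Pass 1: least significant key (v, descending).
keys <- keys[order(v[keys], decreasing = TRUE)]

# Pass 2: most significant key (g, ascending); tied g values
# keep the relative order established in pass 1.
keys <- keys[order(g[keys])]

# Same permutation as the negation trick for numeric keys.
identical(keys, order(g, -v))
```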

With the following dataset (printed below, followed by the
rank of each column)

data <- data.frame(
  ct  = as.POSIXct(c("2009-01-01", "2010-02-03", "2010-02-28"))[c(2,2,2,3,3,1)],
  dt  = as.Date(c("2009-01-01", "2010-02-03", "2010-02-28"))[c(3,2,2,2,3,1)],
  fac = factor(c("Small","Medium","Large"),
               levels=c("Small","Medium","Large"))[c(1,3,2,3,3,1)],
  n   = c(11,12,12,11,12,12))

          ct         dt    fac  n
1 2010-02-03 2010-02-28  Small 11
2 2010-02-03 2010-02-03  Large 12
3 2010-02-03 2010-02-03 Medium 12
4 2010-02-28 2010-02-03  Large 11
5 2010-02-28 2010-02-28  Large 12
6 2009-01-01 2009-01-01  Small 12

   ct  dt fac   n
1 3.0 5.5 1.5 1.5
2 3.0 3.0 5.0 4.5
3 3.0 3.0 3.0 4.5
4 5.5 3.0 5.0 1.5
5 5.5 5.5 5.0 4.5
6 1.0 1.0 1.5 4.5

we get (where my demos use rank because I couldn't remember
the name xtfrm):
[1] TRUE
new.order(fac,n,decreasing=c(FALSE,TRUE))))
[1] TRUE
new.order(ct,dt,decreasing=c(FALSE,TRUE))))
[1] TRUE
new.order(ct,fac,decreasing=c(FALSE,TRUE))))
[1] TRUE
new.order(n,fac,decreasing=c(FALSE,TRUE))))
[1] TRUE

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
#
On 03/02/2010 7:20 PM, Hadley Wickham wrote:
No, I don't think I ever worked it out. :-)

Duncan Murdoch
#
On Wed, 3 Feb 2010, Duncan Murdoch wrote:
The same logic as strxfrm.
#
strxfrm is short for string transform.
=> stxfrm is short for string tansform
=> txfrm is short for tansform
=> xtfrm is short of snatform?

Hadley