Extracting data from dataframe with tied rows

Another strategy is to sort by month, then id, then distance in
decreasing order, and keep the row that starts each month/id run.
That row has the largest distance for its month/id combination.
This can be much faster than the other approaches when there are
many month/id combinations.

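f1 <- function (DATA) 
{
    stopifnot(is.data.frame(DATA),
              all(c("distance", "id", "month") %in% names(DATA)))
    # Sort by month, then id, then decreasing distance
    # (negating distance requires it to be numeric).
    DATA <- DATA[order(DATA$month, DATA$id, -DATA$distance), ]
    # TRUE wherever an element differs from the one before it;
    # the first element always counts as the start of a run.
    ldiff <- function(x) c(TRUE, x[-1] != x[-length(x)])
    # Keep the first row of each month/id run.
    DATA[ldiff(DATA$month) | ldiff(DATA$id), ]
}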
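
As a rough illustration, here is how f1 might be called on a small
made-up data frame. The data and the names d, res, s and alt are
assumptions for this sketch, not from the original thread, and f1 is
repeated so the snippet runs on its own. The last two lines show one
possible alternative using duplicated() on the sorted data, which
should pick the same rows.

```r
# f1 repeated from above so this snippet is self-contained
f1 <- function(DATA) {
    stopifnot(is.data.frame(DATA),
              all(c("distance", "id", "month") %in% names(DATA)))
    DATA <- DATA[order(DATA$month, DATA$id, -DATA$distance), ]
    ldiff <- function(x) c(TRUE, x[-1] != x[-length(x)])
    DATA[ldiff(DATA$month) | ldiff(DATA$id), ]
}

# Hypothetical data: two ids, two months, with a tie for month 1 / id "b"
d <- data.frame(
    month    = c(1, 1, 1, 1, 2, 2, 2),
    id       = c("a", "a", "b", "b", "a", "b", "b"),
    distance = c(5, 9, 3, 3, 7, 2, 8),
    stringsAsFactors = FALSE
)

# One row per month/id combination, the one with the largest distance
res <- f1(d)
print(res)

# Alternative sketch (not the method above): sort the same way, then
# drop every row whose month/id pair has already been seen.
s <- d[order(d$month, d$id, -d$distance), ]
alt <- s[!duplicated(s[c("month", "id")]), ]
```

Note that only one row is returned per month/id combination, so when
several rows tie for the largest distance just the first of them in
the sorted order is kept.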

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com