aggregate slow with variables of type 'dates' - how to solve

Dear all,
I use aggregate() with variables of type numeric and of type 'dates' (from
the chron package). For numeric variables, functions such as sum() are very
fast, but similarly simple functions such as min() are much slower for
variables of type 'dates'. The difference grows with the number of distinct
levels in the 'id' variable; see this sample code:

library(chron)  # provides dates() and chron()

dts <- dates(c("02/27/92", "02/27/92", "01/14/92",
              "02/28/92", "02/01/92"))
ntimes <- 700000
dts <- data.frame(rep(c(1:40), ntimes/8),
                 chron(rep(dts, ntimes), format = c(dates = "m/d/y")),
                 rep(c(0.123, 0.245, 0.423, 0.634, 0.256), ntimes))
names(dts) <- c("id", "date", "tbs")

date()
dat.1st <- aggregate(dts$date, list(id = dts$id), min)$x
dat.1st <- chron(dat.1st, format = c(dates = "m/d/y"))
dat.1st
date() #82 seconds

date()
tbs.s <- aggregate(as.numeric(dts$tbs), list(id = dts$id), sum)
tbs.s
date() #17 seconds
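
As a side note on the timing itself: instead of reading off two date() stamps, system.time() reports the elapsed seconds directly. The following is only a small sketch with made-up vectors (x, id and timing are illustrative names, and the data are far smaller than above), not a reproduction of the timings quoted in this post.

```r
# Illustrative data only: 5000 numeric values spread over 40 ids
x  <- rep(c(0.123, 0.245, 0.423, 0.634, 0.256), 1000)
id <- rep(1:40, length.out = length(x))

# system.time() evaluates the expression and returns user/system/elapsed times
timing <- system.time(res <- aggregate(x, list(id = id), sum))

timing["elapsed"]  # wall-clock seconds for the aggregate() call
```

The same wrapper can be put around the aggregate() call on the 'dates' column to compare the two cases on equal terms.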

Is this a problem with the data type 'dates'? If so, is there a way
around it? For huge data sets this can be a real problem.

As mentioned, if the variable 'id' has, e.g., just 5 levels, the two
timings are roughly the same, but with 40 different ids we see this big
difference.
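
One workaround that might be worth trying, assuming the slowdown comes from handling the 'dates' class once per group (this is a guess and has not been timed here): aggregate the underlying numeric day counts and convert back to 'dates' only once, on the small result. The sketch below uses a tiny made-up example; id, first.num and first are illustrative names.

```r
library(chron)

# Tiny illustrative data: five dates belonging to two ids
d  <- dates(c("02/27/92", "02/27/92", "01/14/92",
              "02/28/92", "02/01/92"))
id <- c(1, 1, 2, 2, 2)

# Aggregate the plain numeric day counts instead of the 'dates' objects
first.num <- aggregate(as.numeric(d), list(id = id), min)

# Restore the 'dates' class once, on the aggregated values only
first <- chron(first.num$x, format = c(dates = "m/d/y"))
first
```

Whether this actually closes the gap to the numeric case would have to be checked on the full data set.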

Thanks a lot,

Christoph
