
Speed optimization on minutes distribution calculation

I think you want something like ?aggregate.zoo

I didn't pull actual volume data, but here is an example that will
show what you can do:

library(xts)  ## only used for the sequence and to leverage aggregate.zoo internally

## generate a sequence of POSIXct 1 mo @ 1min
x <- timeBasedSeq('20090515/20090615 12:00')

## convert to POSIXlt and turn into HHMM numeric format
hm <- as.POSIXlt(x)$min + as.POSIXlt(x)$hour * 100

##  your original "Volume" column (here a simple xts object with each minute having Vol=1000)
##  There are 32 observations at each minute in 00:00--12:00 and 31 for 12:01--23:59
xx <- xts(rep(1000,length(x)), x)

##  using 'aggregate' to apply sum to the matching times
ax <- aggregate(xx, as.factor(hm), sum)

head(ax)

   0 32000
   1 32000
   2 32000
   3 32000
   4 32000
   5 32000

tail(ax)

2354 31000
2355 31000
2356 31000
2357 31000
2358 31000
2359 31000

I haven't had a chance to test this against real data, but at the very
least it should give you a starting point.
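If you'd rather avoid the xts dependency, the same grouping can be done in
base R with tapply(). This is just a sketch of the idea, not your actual
data: I generate one day of 1-minute timestamps, key each observation by an
HHMM number built from POSIXlt fields (as above), and sum volume per key.
The variable names (times, vol, by_minute) are mine, not from your code.

```r
## Base-R sketch of the same minute-of-day aggregation (no xts needed).
## One day of 1-minute timestamps, each carrying Vol=1000.
times <- seq(as.POSIXct("2009-05-15 00:00", tz = "UTC"),
             as.POSIXct("2009-05-16 00:00", tz = "UTC") - 60,
             by = "1 min")
vol <- rep(1000, length(times))

## Build the HHMM key the same way: hour*100 + minute (e.g. 09:30 -> 930)
lt <- as.POSIXlt(times)
hm <- lt$hour * 100 + lt$min

## Sum volume within each minute-of-day bucket
by_minute <- tapply(vol, hm, sum)
head(by_minute)
```

With a single day of data each bucket holds exactly one observation, so
every entry is 1000 and there are 1440 buckets; with a month of data you
would see the 32000/31000 split shown above.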

And the above is very fast:

 system.time(ax <- aggregate(xx, as.factor(hm), sum))
   user  system elapsed
  0.058   0.015   0.073

HTH
Jeff
On Mon, Jun 15, 2009 at 9:27 PM, Wind<windspeedo99 at gmail.com> wrote: