Speed optimization on minutes distribution calculation
I think you want something like ?aggregate.zoo
I didn't pull actual volume data, but here is an example that will
show what you can do:
library(xts) ## only used for the sequence and to leverage
             ## aggregate.zoo internally.
## generate a sequence of POSIXct 1 mo @ 1min
x <- timeBasedSeq('20090515/20090615 12:00')
## convert to POSIXlt and turn into HHMM numeric format
hm <- as.POSIXlt(x)$min + as.POSIXlt(x)$hour * 100
## your original "Volume" column (here a simple xts object with each
## min having Vol=1000)
## There are 32 observations at each minute in 00:00--12:00 and 31
## for 12:01--23:59
xx <- xts(rep(1000,length(x)), x)
## using 'aggregate' to apply sum to the matching times
ax <- aggregate(xx, as.factor(hm), sum)
head(ax)
0 32000
1 32000
2 32000
3 32000
4 32000
5 32000
tail(ax)
2354 31000
2355 31000
2356 31000
2357 31000
2358 31000
2359 31000

I haven't had a chance to actually test this, but at the very least it
should provide a start for you.

And the above is very fast:

system.time(ax <- aggregate(xx, as.factor(hm), sum))
   user  system elapsed
  0.058   0.015   0.073

HTH
Jeff
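The original question asked for the mean volume per minute rather than the sum. A minimal base-R sketch of the same grouping idea follows, using tapply() on the numeric HHMM key. The names idx, vol, avg and cnt and the tz = "UTC" setting are assumptions made for illustration, not part of the example above.

```r
## Stand-in timestamps covering the same span as the example above.
## tz = "UTC" is an assumption, used only to keep the counts reproducible.
idx <- seq(as.POSIXct("2009-05-15 00:00:00", tz = "UTC"),
           as.POSIXct("2009-06-15 12:00:00", tz = "UTC"),
           by = "1 min")

## Constant volume of 1000 per minute, as in the example above.
vol <- rep(1000, length(idx))

## Numeric HHMM key (e.g. 1305 for 13:05), built without any strings.
lt <- as.POSIXlt(idx)
hm <- lt$hour * 100 + lt$min

## Mean volume and number of observations per minute of the day.
avg <- tapply(vol, hm, mean)
cnt <- tapply(vol, hm, length)

head(avg)
head(cnt)
```

With real data, vol would be the numeric volume column and idx its time index; aggregate() with mean in place of sum should give the same grouping on an xts object.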
On Mon, Jun 15, 2009 at 9:27 PM, Wind<windspeedo99 at gmail.com> wrote:
The periodicity() function in xts is a good tool for axis manipulation.

Maybe I should not use character string methods to compile the
distribution of per-minute volume, as Brian suggested. But what function
should be used for such a task in R? I've tried it in kdb+, where it is
fairly simple and quick enough with the select and xbar functions. But I
am not familiar with R. Maybe there are functions for this specific task
that I don't know about.

Thanks Brian.

On Tue, Jun 16, 2009 at 8:00 AM, Brian G. Peterson<brian at braverock.com> wrote:
It seems that the slow part is all the character string manipulation.
This would be slow in almost any programming language.

Honestly, I am always annoyed by useless axes in charts that simply
count from 1 to n. A time axis at least has some real meaning, and
avoids the useless rewriting of character strings.

You should be able to get a meaningful, readable axis using the
periodicity() function in xts without the string manipulation.

Regards,

  - Brian

Wind wrote:
I want to plot the distribution of volume of the future CLN9 along
the 24-hour axis. The following code completes the task, but the step
sapply(mins,function(x)
{mean(hqm[which(format(index(hqm),"%H:%M")==x),5])})
is very time consuming.
Any suggestion for code with better performance would be highly
appreciated.
The data hqm has been retrieved from IB via IBrokers.
head(hqm[,5])
                    CLN9.Volume
2009-05-25 06:00:00          17
2009-05-25 06:01:00           2
2009-05-25 06:02:00          11
2009-05-25 06:03:00          26
2009-05-25 06:04:00          20
2009-05-25 06:05:00           5
tail(hqm[,5])
                    CLN9.Volume
2009-06-15 21:51:00        1050
2009-06-15 21:52:00         807
2009-06-15 21:53:00         782
2009-06-15 21:54:00         385
2009-06-15 21:55:00         562
2009-06-15 21:56:00         423
mins<-unlist(lapply(0:23,function(h){sapply(0:59,function(m){paste(sprintf("%02d",h),sprintf("%02d",m),sep=":")})}))
head(mins)
[1] "00:00" "00:01" "00:02" "00:03" "00:04" "00:05"
tail(mins)
[1] "23:54" "23:55" "23:56" "23:57" "23:58" "23:59"
temp<-sapply(mins,function(x)
{mean(hqm[which(format(index(hqm),"%H:%M")==x),5])})
head(temp)
   00:00    00:01    00:02    00:03    00:04    00:05
279.1333 284.9333 247.8667 176.3333 278.8667 179.0667
tail(temp)
   23:54    23:55    23:56    23:57    23:58    23:59
250.2667 312.7333 318.9333 210.8000 258.2000 232.8667
plot(temp)
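The loop above calls format() on the whole index once for each of the 1440 labels. A rough sketch of one way to avoid that, assuming the "HH:MM" labels are still wanted, is to format the index a single time and group with tapply(). The data below is a made-up stand-in for hqm[,5], and the names idx, vol, key and temp_fast are hypothetical.

```r
## Hypothetical stand-in for hqm[,5]: two days of 1-minute volumes.
set.seed(42)
idx <- seq(as.POSIXct("2009-05-25 06:00:00", tz = "UTC"),
           by = "1 min", length.out = 2 * 1440)
vol <- sample(1:1000, length(idx), replace = TRUE)

## Format the index once, instead of once per minute label.
key <- format(idx, "%H:%M")

## Mean volume per "HH:MM" label.
temp_fast <- tapply(vol, key, mean)

## All 1440 labels, built with one vectorised sprintf() call.
mins <- sprintf("%02d:%02d", rep(0:23, each = 60), rep(0:59, times = 24))

## Reorder to match 'mins'; a label with no data would come back as NA.
temp_fast <- temp_fast[mins]

## Check a few labels against the original per-label approach.
chk <- sapply(mins[1:5], function(m) mean(vol[key == m]))
stopifnot(isTRUE(all.equal(as.numeric(temp_fast[1:5]), as.numeric(chk))))

head(temp_fast)
```

With the real object, key would presumably be format(index(hqm), "%H:%M") and vol the numeric values of hqm[,5]. A numeric key built from as.POSIXlt() avoids the string step altogether.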
_______________________________________________ R-SIG-Finance at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-finance -- Subscriber-posting only. -- If you want to post, subscribe first.
-- Brian G. Peterson http://braverock.com/brian/ Ph: 773-459-4973 IM: bgpbraverock
Jeffrey Ryan jeffrey.ryan at insightalgo.com ia: insight algorithmics www.insightalgo.com