Hi, I have a big dataset of tick data (actual transactions at sub-second frequency). I've used R to convert this to an xts object, and I'm iterating through the series to simulate a trading day. I'd like to know when a particular transaction is "on the minute" or "on the 5 minute" mark. Of course, I can use all the great functions of xts to convert the series to bars of any frequency, but that's not what I want. I'd like to iterate over all the transactions and then somehow know when I've hit a bar boundary. One possible issue is if I don't have a transaction exactly at the bar. (More of a theoretical question.) For example, if I have a transaction at 11:59:59, and the next transaction is at 12:00:01, then no transaction occurred exactly on the 1 minute boundary. I'm not sure of the best way to handle this. The general "big picture" concept is that a trading strategy might only want to trigger on 5 minute bars, but I'd like to keep updating some of my indicators as individual trades flow in. Thoughts? -- Noah Silverman UCLA Department of Statistics 8117 Math Sciences Building Los Angeles, CA 90095
How to know when an XTS row is at a particular interval
7 messages · Noah Silverman, Brian G. Peterson, Ulrich Staudinger
On Thu, 2011-07-07 at 13:27 -0700, Noah Silverman wrote:
Hi, I have a big dataset of tick data (actual transactions at sub-second frequency). I've used R to convert this to an xts object, and I'm iterating through the series to simulate a trading day. I'd like to know when a particular transaction is "on the minute" or "on the 5 minute" mark. Of course, I can use all the great functions of xts to convert the series to bars of any frequency, but that's not what I want. I'd like to iterate over all the transactions and then somehow know when I've hit a bar boundary.
With actual ticks, there will be few or no trades 'on the mark'. This is why most tick-level analysis relies on the prevailing bid and offer, since there is always *some* real price at any given time, even though trades happen at discrete times. Anyway, the function you need is 'endpoints', which will give you the row index of the last stamp in each period, i.e. the closest stamp before the mark, at whatever periodicity you want. Then you can filter for the almost nonexistent transactions 'on the mark'.
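As a rough illustration of the 'endpoints' suggestion above, here is a minimal sketch. The tick data, the variable names, and the UTC time zone are all made up for the example; only `xts()`, `endpoints()`, and `index()` come from the xts package.

```r
library(xts)

# Hypothetical tick data: trade prices with irregular timestamps (UTC assumed).
stamps <- as.POSIXct(c("2011-07-07 11:58:30", "2011-07-07 11:59:59",
                       "2011-07-07 12:00:01", "2011-07-07 12:00:40",
                       "2011-07-07 12:01:15"), tz = "UTC")
ticks <- xts(c(100.0, 100.1, 100.2, 100.1, 100.3), order.by = stamps)

# endpoints() returns row numbers, starting with 0, of the last tick in each period.
ep <- endpoints(ticks, on = "minutes", k = 1)

# Drop the leading 0 to select the last tick of each 1-minute bar.
bar_ends <- ticks[ep[-1]]

# Flag ticks stamped exactly on a minute mark (with real data, usually none).
is_on_mark <- as.numeric(index(ticks)) %% 60 == 0
```

Note that the final row is always reported as an endpoint, even when its bar is still incomplete.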
One possible issue is if I don't have a transaction exactly at the bar. (More of a theoretical question.) For example, if I have a transaction at 11:59:59, and the next transaction is at 12:00:01, then no transaction occurred exactly on the 1 minute boundary. I'm not sure of the best way to handle this. The general "big picture" concept is that a trading strategy might only want to trigger on 5 minute bars, but I'd like to keep updating some of my indicators as individual trades flow in.
Use bids and offers.
_______________________________________________ R-SIG-Finance at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-finance -- Subscriber-posting only. If you want to post, subscribe first. -- Also note that this is not the r-help list where general R questions should go.
Brian G. Peterson http://braverock.com/brian/ Ph: 773-459-4973 IM: bgpbraverock
1 day later
Hi, Endpoints won't always work for me either. In order to use endpoints, I would need to know the whole series ahead of time. If I have transactions streaming in (like from an IBrokers account), then I can't compute "endpoints". I suppose that I could calculate the minutes and seconds of each transaction and then compare them to the bar frequency I want, but that seems computationally excessive. I guess what I want to do is, when a transaction arrives, ask: "Is this on the mark, or close enough to treat it as on the mark?" -- Noah Silverman UCLA Department of Statistics 8117 Math Sciences Building Los Angeles, CA 90095
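For what it's worth, the per-transaction comparison described above can be reduced to one integer division per tick. The following is only a sketch: `bar_id` and `crossed_mark` are hypothetical helper names, and bars are assumed to be aligned to the UTC epoch.

```r
# Bar length in seconds (5-minute bars); an illustrative choice.
bar_seconds <- 300

# Hypothetical helper: number of the bar that a timestamp falls in.
bar_id <- function(stamp, width = bar_seconds) {
  as.numeric(stamp) %/% width
}

# Hypothetical helper: TRUE when `stamp` is in a later bar than `prev_stamp`,
# i.e. at least one mark was reached or passed between the two ticks.
crossed_mark <- function(prev_stamp, stamp, width = bar_seconds) {
  bar_id(stamp, width) != bar_id(prev_stamp, width)
}

prev <- as.POSIXct("2011-07-07 11:59:59", tz = "UTC")
nxt  <- as.POSIXct("2011-07-07 12:00:01", tz = "UTC")
crossed_mark(prev, nxt)       # TRUE: the 12:00:00 mark was passed
crossed_mark(nxt, nxt + 30)   # FALSE: still inside the same bar
```

This needs only the previous timestamp, so it does not require knowing the whole series ahead of time.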
On Sat, 2011-07-09 at 12:37 -0700, Noah Silverman wrote:
Hi, Endpoints won't always work for me either. In order to use endpoints, I would need to know the whole series ahead of time. If I have transactions streaming in (like from an IBrokers account), then I can't compute "endpoints".
I'll note that you didn't say that you wanted to apply this to streaming data. Your original post said: "I have a big dataset of tick data. (Actual transactions at sub-second frequency)" though you did say: "Iterating through the series to simulate a trading day." You didn't say that you ultimately wanted to apply the same algorithm to streaming data. All backtesting (trading simulation) involves some amount of iteration through the data, assuming that there is some path-dependent logic (and real trading is always path-dependent, though path-independent assumptions may serve for simple things). Even when 'iterating' through historical data, endpoints is still the most efficient mechanism in R that I'm aware of. Streaming data is necessarily a different problem. R doesn't have any native support for streaming data types that I am aware of. You need to use some sort of sampling or event loop algorithm.
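One way to read the point about iterating over historical data is to compute the bar ends once with 'endpoints' and then consult a flag inside the loop. This is a sketch under assumptions: the tick values are invented, and the running maximum merely stands in for whatever tick-level indicator is being maintained.

```r
library(xts)

# Hypothetical tick series; in practice this is the full historical xts object.
stamps <- as.POSIXct("2011-07-07 11:55:00", tz = "UTC") +
  c(10, 200, 299, 301, 450, 610)
ticks <- xts(c(100.0, 100.2, 100.1, 100.3, 100.4, 100.2), order.by = stamps)

# Mark the last tick of every 5-minute bar once, before the loop starts.
is_bar_end <- rep(FALSE, nrow(ticks))
is_bar_end[endpoints(ticks, on = "minutes", k = 5)[-1]] <- TRUE

prices <- as.numeric(coredata(ticks))
running_max <- -Inf      # stands in for any tick-level indicator
signals <- numeric(0)    # indicator values seen by the bar-level strategy

for (i in seq_along(prices)) {
  running_max <- max(running_max, prices[i])  # update on every tick
  if (is_bar_end[i]) {
    signals <- c(signals, running_max)        # act only at bar boundaries
  }
}
```

The indicator is updated on every trade, while the strategy logic only fires on the rows flagged as bar ends.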
I suppose that I could calculate the minutes and seconds of each transaction and then compare them to the bar frequency I want, but that seems computationally excessive. I guess what I want to do is, when a transaction arrives, ask: "Is this on the mark, or close enough to treat it as on the mark?"
Well, messages from an exchange are timestamped. Use those. Keep your 'last' transaction in a buffer, and then pop it onto a stack as soon as you have an observation (transaction or not) that is at or past the mark. Jeff has previously reported being able to handle several thousand messages per second from IB in an event loop in R, which is probably sufficient for most single-instrument modeling. Regards, - Brian
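The buffer-and-stack idea above might look roughly like the following in plain R. Everything here is hypothetical (the `make_bar_tracker` name, the message signature, and the epoch-aligned bars); it does not use the IBrokers API, and a real event loop would call `on_message` from its own callback.

```r
# Hypothetical handler factory: keeps the last trade in a buffer and records it
# as a bar's closing trade once an observation arrives at or past the next mark.
make_bar_tracker <- function(bar_seconds = 300) {
  last_trade <- NULL       # buffer holding the most recent trade
  next_mark <- NA_real_    # epoch seconds of the upcoming bar boundary
  closes <- list()         # stack of closing trades, one per detected boundary

  # `price` is NA for non-trade observations such as quotes.
  on_message <- function(stamp, price = NA_real_) {
    now <- as.numeric(stamp)
    if (is.na(next_mark)) {
      next_mark <<- (now %/% bar_seconds + 1) * bar_seconds
    }
    bar_closed <- FALSE
    if (now >= next_mark) {
      # The buffered trade was the last one before the mark.
      if (!is.null(last_trade)) {
        closes[[length(closes) + 1]] <<- last_trade
      }
      next_mark <<- (now %/% bar_seconds + 1) * bar_seconds
      bar_closed <- TRUE
    }
    if (!is.na(price)) {
      last_trade <<- list(time = stamp, price = price)
    }
    bar_closed
  }

  list(on_message = on_message, closes = function() closes)
}

tracker <- make_bar_tracker(bar_seconds = 60)
t0 <- as.POSIXct("2011-07-07 11:59:30", tz = "UTC")
tracker$on_message(t0, 100.0)        # trade at 11:59:30
tracker$on_message(t0 + 29, 100.1)   # trade at 11:59:59
tracker$on_message(t0 + 31, 100.2)   # trade at 12:00:01, past the mark
```

In this sketch a trade stamped exactly on the mark counts toward the new bar, and a gap spanning several bars with no messages produces only one recorded close; both are design choices rather than anything prescribed in the thread.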