high frequency data analysis in R
I want to see what statistical experiments I can run on my data. The very first thing that came to my mind was "correlation"... But I am not sure whether the usual concept of correlation is directly applicable after I resample the data onto a regular grid. Then again, another question is what a good resampling period would be. Maybe correlation is sensitive to the resampling period...
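One rough way to probe that sensitivity is a small simulation. The sketch below is an illustration only, not something from the thread: it uses base R, simulated prices, and made-up tick rates, and the helper names `prev_tick` and `cor_at` are hypothetical. Two correlated log-price paths are observed at different random tick times, resampled with the "last price stays in force" rule, and correlated at several periods. The drop in measured correlation at very short periods is usually called the Epps effect.

```r
# Sketch: how the correlation of two asynchronously observed prices
# depends on the resampling period (simulated data, base R only).
set.seed(1)

horizon <- 6.5 * 3600          # one trading session, in seconds
rho     <- 0.8                 # assumed true correlation of 1-second returns

# Latent log-prices on a 1-second grid (correlated random walks).
z1 <- rnorm(horizon)
z2 <- rho * z1 + sqrt(1 - rho^2) * rnorm(horizon)
p1 <- cumsum(1e-4 * z1)
p2 <- cumsum(1e-4 * z2)

# Each asset is only observed at its own random tick times
# (roughly one tick per 5 and per 8 seconds; these rates are made up).
ticks1 <- sort(sample(horizon, horizon / 5))
ticks2 <- sort(sample(horizon, horizon / 8))

# Previous-tick sampling: the price at time t is the last observed price.
prev_tick <- function(tick_times, tick_prices, grid) {
  idx <- findInterval(grid, tick_times)   # index of last tick at or before t
  idx[idx == 0] <- NA                     # no tick observed yet
  tick_prices[idx]
}

# Correlation of returns after resampling both series every `period` seconds.
cor_at <- function(period) {
  grid <- seq(period, horizon, by = period)
  r1 <- diff(prev_tick(ticks1, p1[ticks1], grid))
  r2 <- diff(prev_tick(ticks2, p2[ticks2], grid))
  cor(r1, r2, use = "complete.obs")
}

periods <- c(1, 5, 30, 60, 300, 900)
cors <- sapply(periods, cor_at)
print(data.frame(period_sec = periods, correlation = round(cors, 3)))
```

With real data, the same loop over `periods` applied to the two resampled series gives a quick picture of how stable the estimate is before committing to one period.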
On Thu, May 21, 2009 at 1:37 PM, <markleeds at verizon.net> wrote:
In that case, it begs the question of why you want to regularly space your data. All the info is there, so why reduce the amount of it by regularly spacing?

On May 21, 2009, Michael <comtech.usa at gmail.com> wrote:

In fact, I have the whole jump processes of the best bid and the best ask in continuous time (in the sense of time-stamped arrival data), and also the jump process of the last trade price, likewise as time-stamped arrival data. Any more thoughts?

On Thu, May 21, 2009 at 9:51 AM, Hae Kyung Im <hakyim at gmail.com> wrote:
Regarding the approach that turns irregular data into regular data, I guess it's a complex question, and how you approach it will depend on the specific problem. With your method, you would assume that the price is equal to the last traded price or something like that. If there is no trade for some time, does it make sense to say that the price is the last traded price? If you actually wanted to buy or sell at that price, it's not obvious that you would be able to do so.

Also, if you only look at the time series of instantaneous prices, you lose a lot of information about what happened between the time points. It makes more sense to aggregate and keep, for example, the open, high, low and close, or some statistics on the distribution of the prices between the endpoints.

If what you need to calculate is correlations, then I would look at the papers that Liviu suggested. It seems that synchronicity is critical. I heard there is an extension of TSRV to correlations.

If you only need to look at univariate time series, you may be able to get away with your method more easily. It may not be statistically efficient, but it may give you a good enough answer in some cases.

HTH
Haky

On Thu, May 21, 2009 at 10:38 AM, Michael <comtech.usa at gmail.com> wrote:
My data are price-change arrivals, irregularly spaced. But when there is no price change, the price stays constant. So for any time instant you give me, I can give you the price at that very instant. Irregularly spaced data can therefore easily be sampled onto a regular grid. What do you think of this approach?

On Thu, May 21, 2009 at 8:21 AM, Michael <comtech.usa at gmail.com> wrote:
Thanks Jeff. By high frequency I really mean tick data. For example, during peak time, price events can arrive at a rate of hundreds to thousands per second, irregularly spaced. I've heard that forcing irregularly spaced data into regularly spaced data (e.g. through interpolation) loses information. Why is that? Thanks!

On Thu, May 21, 2009 at 8:15 AM, Jeff Ryan <jeff.a.ryan at gmail.com> wrote:
Not my domain, but you will more than likely have to aggregate to some sort of regular/homogeneous series for most traditional tools to work.

xts has to.period to aggregate up to a lower frequency from tick-level data. Coupled with something like na.locf, you can make yourself some high-frequency 'regular' data from 'irregular' data.

Regular and irregular of course depend on what you are looking at (daily data with weekends missing can still be 'regular').

I'd be interested in hearing thoughts from those who actually tread in the high-freq domain...

A wealth of information can be found here: http://www.olsen.ch/publications/working-papers/

Jeff

On Thu, May 21, 2009 at 10:04 AM, Michael <comtech.usa at gmail.com> wrote:
Hi all,

I am wondering whether there are special toolboxes for handling high-frequency data in R. I have some high-frequency data and was wondering what meaningful experiments I can run on them. I am not sure whether the usual (low-frequency) textbook tools for financial time series will work for high-frequency data. Say I compute a correlation between two stocks using the high-frequency data, or fit an ARMA model to one stock: will the results be meaningful?

Could anybody point me to a classroom-style treatment or lab-tutorial type of document that shows what meaningful experiments/tests I can run on high-frequency data?

Thanks a lot!
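For reference, the aggregation route suggested in the replies (to.period from xts, plus na.locf) might look roughly like the sketch below. It is a minimal illustration under stated assumptions, not a recommended workflow: the tick data are simulated, and the 1-minute bar size, the 10-second grid and the column prefix "px" are arbitrary choices.

```r
library(xts)   # also attaches zoo; na.locf is used below

set.seed(42)

# Hypothetical tick data: 500 irregularly spaced trades within one hour.
start  <- as.POSIXct("2009-05-21 09:30:00", tz = "UTC")
times  <- start + sort(runif(500, min = 0, max = 3600))
prices <- 100 + cumsum(rnorm(500, sd = 0.01))
ticks  <- xts(prices, order.by = times)

# Aggregate the ticks into 1-minute open/high/low/close bars.
bars <- to.period(ticks, period = "minutes", k = 1, name = "px")

# Alternatively, read off the last known price on a regular 10-second grid:
# merge with an empty series on that grid, carry the last observation
# forward, then keep only the grid times.
grid    <- seq(start, by = 10, length.out = 361)
regular <- na.locf(merge(ticks, xts(, grid)), na.rm = FALSE)[grid]

head(bars)
head(regular)
```

The first grid point precedes the first trade, so `regular` starts with an NA. The bars keep the high and low reached inside each interval, which the grid sample discards.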
_______________________________________________ R-SIG-Finance at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-finance -- Subscriber-posting only. -- If you want to post, subscribe first.
-- Jeffrey Ryan jeffrey.ryan at insightalgo.com ia: insight algorithmics www.insightalgo.com