Interpolating/comparing two irregular time/price sequences?
Thanks, everyone, for your extremely helpful comments on this issue.

Eric, that is a very interesting point you have raised. Did Peter publish a paper on this topic? If so, do you happen to know the title? Intuitively, I feel that the previous-tick method should be more reliable than interpolation for high-frequency data, although it would be nice to see some research confirming this.

Thanks
Rory
On Nov 8, 2007 9:33 PM, Eric Zivot <ezivot at u.washington.edu> wrote:
Just a few quick comments on this issue.

The Olsen group book, An Introduction to High-Frequency Finance, discusses various interpolation schemes for aligning multiple irregularly spaced series. For realized variance modeling, Peter Hansen at Stanford showed that one should use the "previous tick" method for aligning data to a common time clock, and not a linear interpolation between neighboring ticks. The latter method leads to degenerate results as you sample more frequently, since the quadratic variation of a line is zero.

The type of alignment discussed below is handled in the timeSeries class in S-PLUS using the align() function. Diethelm Wuertz implemented a subset of this class in R, and I think the align() function is there too.
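To make the degeneracy point concrete, here is a minimal base R sketch (not from the thread) that samples one simulated tick series onto progressively finer regular grids, once with previous-tick alignment and once with linear interpolation. The simulated data, the seed, the grid sizes and the helper name realized_var are all assumptions for illustration; the only library call is approx() from the stats package.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Simulated irregular ticks (assumption): exponential gaps, random-walk log-price.
n <- 200
tick_times <- cumsum(rexp(n, rate = 1))
price <- cumsum(rnorm(n, sd = 0.01))

# Realized variance: sum of squared increments of the sampled series.
realized_var <- function(x) sum(diff(x)^2)

grid_sizes <- c(100, 1000, 10000, 100000)
rv <- t(sapply(grid_sizes, function(m) {
  grid <- seq(tick_times[1], tick_times[n], length.out = m)
  # method = "constant" with f = 0 carries the last observed tick forward.
  prev <- approx(tick_times, price, xout = grid,
                 method = "constant", f = 0, rule = 2)$y
  # Linear interpolation between the neighboring ticks.
  lin <- approx(tick_times, price, xout = grid,
                method = "linear", rule = 2)$y
  c(points = m, previous_tick = realized_var(prev), linear = realized_var(lin))
}))
print(rv)
```

On the finest grids the linear column should shrink toward zero, because each straight segment is cut into ever more, ever smaller increments, while the previous-tick column settles near the sum of squared tick-to-tick changes.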
________________________________
From: r-sig-finance-bounces at stat.math.ethz.ch [mailto:r-sig-finance-bounces at stat.math.ethz.ch] On Behalf Of Adrian Trapletti
Sent: Thursday, November 08, 2007 3:37 AM
To: rory.winston at gmail.com
Cc: R-Finance
Subject: Re: [R-SIG-Finance] Interpolating/comparing two irregular time/price sequences?

Rory,

There is no best method for synchronizing high-frequency data. It depends on the application. One of the pioneers of high-frequency financial data modelling was http://www.olsen.ch . In the 1990s they published some articles in which they used interpolation schemes to model irregularly spaced high-frequency data with standard discrete time series methods. You can find some of these articles on their website.

Currently, there is a lot of work on realized variance/volatility, and when it comes to multivariate applications, you may find some methods there: http://www.google.ch/search?hl=en&q=%22realized+covariance%22&btnG=Search&meta=

Best regards
Adrian

Date: Wed, 7 Nov 2007 18:31:16 +0000
From: "Rory Winston" <rory.winston at gmail.com>
Subject: [R-SIG-Finance] Interpolating/comparing two irregular time/price sequences?
To: r-sig-finance at stat.math.ethz.ch

Hi all

I have two data frames that both look like the following:

    head(series1)
         timestamp     mid   spread
    1 1.194438e+12 2.10011 0.000260
    2 1.194438e+12 2.10010 0.000290
    ...

These two time sequences are sampled on price ticks, so the interval between ticks is stochastic and irregular. The time sequences are also of different lengths, i.e. one may have 8 hours' worth of data, the other may have 4. My issue is that I want to compare these two series for similarity: they should be producing almost exactly the same data, although potentially at slightly different timestamps (hence the sampling irregularity).

I can subset the data so that they span roughly the same time intervals, but the number of ticks in each series will be different. Basically, what I am trying to achieve is some sort of constant interpolation based on a time index, so that if series A starts at 08:01, contains 10,000 ticks, and ends at 16:05, and series B starts at 08:00, contains 7,000 ticks, and ends at 16:06, I would like to be able to index from series A into series B at, say, each timestamp in A.

Using a simple example, for the following series A and B:

    A:
    time   tick
    16:01  2.05
    16:02  2.06

    B:
    time   tick
    16:00  2.04
    16:02  2.06

I would like to be able to index from A into B at each tick from A, so I would get an output series that was the value of B at each time A ticked:

    C:
    time   tick
    16:01  2.04  <--- constant interpolation from value of B @ 16:00
    16:02  2.06

Has anyone done anything like this before? I'm looking at the zoo package to see if it can help me, but I haven't quite figured out how to do this kind of thing yet.

Is this even a good way of checking whether series B is very similar to series A at the discrete tick intervals? Any better methods? (I guess another way might be to align the two subsetted series exactly and just take differences.)

Thanks
Rory

--
Adrian Trapletti
Wildsbergstrasse 31
8610 Uster
Switzerland
Phone : +41 (0) 44 9945630
Mobile : +41 (0) 76 3705631
Email : a.trapletti at swissonline.ch
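The indexing Rory asks for in his A/B/C example (the value of B as of each time A ticked) can be sketched in base R with findInterval(). This is one possible approach under stated assumptions, not something proposed in the thread: the helper name prev_tick is hypothetical, times are encoded as minutes since midnight for simplicity, and B's timestamps are assumed to be sorted in increasing order.

```r
# Hypothetical helper: value of series B as of each timestamp in A (previous tick).
# Assumes times_b is sorted in non-decreasing order.
prev_tick <- function(times_a, times_b, values_b) {
  # For each element of times_a, findInterval() gives the index of the last
  # element of times_b that is <= it, or 0 if times_a precedes every B tick.
  idx <- findInterval(times_a, times_b)
  idx[idx == 0] <- NA  # no earlier tick in B: leave the value missing
  values_b[idx]
}

# The A and B series from the example, with times as minutes since midnight.
a <- data.frame(time = c(16 * 60 + 1, 16 * 60 + 2), tick = c(2.05, 2.06))
b <- data.frame(time = c(16 * 60 + 0, 16 * 60 + 2), tick = c(2.04, 2.06))

# Series C: B sampled at A's timestamps, plus the A-minus-B difference.
c_series <- data.frame(time = a$time,
                       tick = prev_tick(a$time, b$time, b$tick))
c_series$diff <- a$tick - c_series$tick
print(c_series)
```

With both series on A's clock, the diff column gives a direct tick-by-tick comparison, which is the "align and take differences" idea mentioned at the end of the question.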