Vectorized rolling computation on xts series
Another approach would be to use zoo's or xts's lag function to generate a data frame or matrix holding the current day's data and the N previous periods in one table. If your data is univariate, this shouldn't be a problem: do a little math and you can easily specify how many days of data to "go back".

You'd do something like this. Starting with:

    d1 d2 d3 d4 d5

use Lag (or lag; they behave differently), then t() and apply(), and end up with:

    d1 na na na
    d2 d1 na na
    d3 d2 d1 na
    d4 d3 d2 d1
    d5 d4 d3 d2

You can then easily run your computations in vectorized form using the apply family of functions.
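The lag-matrix idea can be sketched in base R. This is only an illustration, not the exact Lag/t()/apply() recipe above: it swaps in base R's embed() to build the same NA-padded table, and the series x and window length n are made-up values chosen to reproduce the 25% example from the original question.

```r
# Hypothetical toy series; min 10, max 50, x = 20 mirrors the 25% example in the thread.
x <- c(10, 50, 20, 30, 40)
n <- 3  # window length in observations: the current value plus n - 1 lags

# Pad with n - 1 leading NAs so every element gets a row.
# Row i holds x[i], x[i-1], ..., x[i-n+1], with NA where no history exists.
lagged <- embed(c(rep(NA, n - 1), x), n)

# Row-wise min and max; rows with incomplete history stay NA.
lo <- apply(lagged, 1, min)
hi <- apply(lagged, 1, max)

# Position of each value within its window's range, in percent.
pct <- (x - lo) / (hi - lo) * 100
print(cbind(x, lo, hi, pct))
```

Note that this counts observations rather than calendar time, so on daily data with gaps a window of n rows does not correspond to a fixed duration.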
On Wed, Oct 7, 2009 at 3:30 AM, Mark Breman <breman.mark at gmail.com> wrote:
Hi Shane,

I had a look at these functions but they do not satisfy my constraints:

- apply.monthly works with 'calendar months', but I need a function that allows me to specify, for instance, 1995-01-06 until 1995-02-06 (i.e. a 'duration' of one month) for the computation of element x = 1995-02-06.
- rollapply (and also rollmax, rollmin) needs a specification of the number of previous elements from the series, if I understand it correctly. As you can see in the example, it is daily data but with lots of gaps, so this would be very difficult to do, if at all possible.

Thanks for your quick response though.

Kind regards,
-Mark-

2009/10/7 Shane <shane.conway at gmail.com>
I think you want the apply.monthly function in xts. It also has other time periods (e.g. daily). You may also want to look at rollapply in zoo.

Sent from my iPhone

On Oct 7, 2009, at 4:05 AM, Mark Breman <breman.mark at gmail.com> wrote:

Hi,
I have a univariate xts timeseries (daily data) for which I need to apply a computation for each element. The computation for element x needs the last y months of data from the timeseries. What's more, I need a "vectorized" computation because looping over all elements is too slow (it's a large timeseries). I think this is what is called a "rolling" or "running" computation in R.

The computation I need to do for element x is:

- calculate the percentage of the value x within the range of values from the last y months, i.e. determine the min() and max() of the last y months of data (including x), and determine what percentage of this range the value x is. For example: min(last 1 months) == 10, max(last 1 months) == 50, x == 20 would yield 25%.
- elements for which y months of previous data (including x itself) are not available should become NaN or some other "special value".

An example

So let's say I have a timeseries called "data":

> data
           NonCommNet
1995-01-03      44580
1995-01-04      44580
1995-01-05      44580
1995-01-06      44580
1995-01-09      44580
1995-01-10      32835
1995-01-11      32835
1995-01-12      32835
1995-01-13      32835
1995-01-16      32835
1995-01-17      38385
1995-01-18      38385
1995-01-19      38385
1995-01-20      38385
1995-01-23      38385
1995-01-24      19150
1995-01-25      19150
1995-01-26      19150
1995-01-27      19150
1995-01-30      19150
1995-01-31      15245
1995-02-01      15245
1995-02-02      15245
1995-02-03      15245
1995-02-06      15245
1995-02-07      24110
1995-02-08      24110
1995-02-09      24110
1995-02-10      24110
1995-02-13      24110
1995-02-14      17615
1995-02-15      17615
1995-02-16      17615
1995-02-17      17615
1995-02-21     -23080
1995-02-22     -23080
1995-02-23     -23080
1995-02-24     -23080
1995-02-27     -23080
1995-02-28     -17445

I tried the following "vectorized" solution (example with y = 1 month):
((data - min(last(data, "1 months"))) / (max(last(data, "1 months")) -
min(last(data, "1 months")))) * 100

           NonCommNet
1995-01-03  143.37783
1995-01-04  143.37783
1995-01-05  143.37783
1995-01-06  143.37783
1995-01-09  143.37783
1995-01-10  118.48909
1995-01-11  118.48909
1995-01-12  118.48909
1995-01-13  118.48909
1995-01-16  118.48909
1995-01-17  130.25005
1995-01-18  130.25005
1995-01-19  130.25005
1995-01-20  130.25005
1995-01-23  130.25005
1995-01-24   89.48930
1995-01-25   89.48930
1995-01-26   89.48930
1995-01-27   89.48930
1995-01-30   89.48930
1995-01-31   81.21424
1995-02-01   81.21424
1995-02-02   81.21424
1995-02-03   81.21424
1995-02-06   81.21424
1995-02-07  100.00000
1995-02-08  100.00000
1995-02-09  100.00000
1995-02-10  100.00000
1995-02-13  100.00000
1995-02-14   86.23649
1995-02-15   86.23649
1995-02-16   86.23649
1995-02-17   86.23649
1995-02-21    0.00000
1995-02-22    0.00000
1995-02-23    0.00000
1995-02-24    0.00000
1995-02-27    0.00000
1995-02-28   11.94109

This does not satisfy my constraints because:

1) the first month of data should have become NaN or some other special value, as there is not a full month of previous data available. I think this is caused by the last() function, which simply returns the available data if the requested amount of data is greater than the available amount of data.

2) the results for the second month of data are wrong. For instance, look at the result for 1995-02-06, which is 81.21424%. This should have been 0%. The last month's min() is 15245 (from 1995-02-06) and the max() is 44580 (from element 1995-01-06), so it should yield 0%.

From analyzing the results I get the impression that the last() function is not suited for a "vectorized" solution, but I'm not really sure...

I also had a look at runMin() and runMax() from the TTR package, but you can't specify a calendar range with these functions as you can with last() and first() from the xts package.

Now my question is: am I doing something wrong here, or do you know another vectorized function that satisfies my constraints?

Kind regards,
-Mark-
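The output above is consistent with last(data, "1 months") being evaluated once on the whole series, so every element is scaled against the same min and max (-23080 and 24110) rather than against its own trailing window. A minimal base R sketch of the per-element calendar window described in the question follows. It is not a solution from this thread and it is not fully vectorized: the window boundaries are found in one vectorized step, but min() and max() are still computed per element. The function name rolling_pct_range and the rule that a window counts as complete only when its start date is not before the first observation are assumptions made for illustration.

```r
# Dates and values copied from the example series in the question.
dates <- as.Date(c(
  "1995-01-03", "1995-01-04", "1995-01-05", "1995-01-06", "1995-01-09",
  "1995-01-10", "1995-01-11", "1995-01-12", "1995-01-13", "1995-01-16",
  "1995-01-17", "1995-01-18", "1995-01-19", "1995-01-20", "1995-01-23",
  "1995-01-24", "1995-01-25", "1995-01-26", "1995-01-27", "1995-01-30",
  "1995-01-31", "1995-02-01", "1995-02-02", "1995-02-03", "1995-02-06",
  "1995-02-07", "1995-02-08", "1995-02-09", "1995-02-10", "1995-02-13",
  "1995-02-14", "1995-02-15", "1995-02-16", "1995-02-17", "1995-02-21",
  "1995-02-22", "1995-02-23", "1995-02-24", "1995-02-27", "1995-02-28"))
values <- c(rep(c(44580, 32835, 38385, 19150, 15245, 24110), each = 5),
            rep(17615, 4), rep(-23080, 5), -17445)

# Hypothetical helper: percent position of each value within the range of
# the trailing `months` calendar months (window includes the value itself).
rolling_pct_range <- function(dates, values, months = 1) {
  # Shift every date back by `months` calendar months in one step.
  lt <- as.POSIXlt(dates)
  lt$mon <- lt$mon - months
  window_start <- as.Date(lt)

  # Index of the first observation on or after each window start
  # (dates must be sorted in increasing order).
  first_idx <- findInterval(as.numeric(window_start), as.numeric(dates),
                            left.open = TRUE) + 1L

  vapply(seq_along(values), function(i) {
    # Assumption: the window is incomplete if it starts before the first date.
    if (window_start[i] < dates[1]) return(NA_real_)
    w <- values[first_idx[i]:i]
    (values[i] - min(w)) / (max(w) - min(w)) * 100  # NaN if the range is zero
  }, numeric(1))
}

pct <- rolling_pct_range(dates, values, months = 1)
print(data.frame(date = dates, pct = round(pct, 5)))
```

With the example data, every element before 1995-02-03 comes out as NA and the element for 1995-02-06 comes out as 0, which is the behaviour asked for in points 1) and 2) above.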
_______________________________________________ R-SIG-Finance at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-finance -- Subscriber-posting only. -- If you want to post, subscribe first.
Aleks Clark