Message-ID: <CAP01uRma-jt-1v_eNzZYCcrPtGRr9Vw4kDTkJ2T3Xr5BBmUk9A@mail.gmail.com>
Date: 2011-11-04T13:24:23Z
From: Gabor Grothendieck
Subject: Rolling through fixed-length time windows
In-Reply-To: <CABLRk0FAXDBpY5_wA-4Sciunxy1+vGB-k2s8aGiEkSw4fd1PFg@mail.gmail.com>
On Fri, Nov 4, 2011 at 9:09 AM, Matthew Clegg <matthewcleggphd at gmail.com> wrote:
> Hello R-Sig-Finance members:
>
> I was wondering if anyone has contributed functions that are similar
> to the zoo roll* functions but which operate on fixed-length time
> windows? For example, suppose I have a zoo-based object consisting
> of the daily closing prices of a stock, and I wish to know for each
> date, what was the volatility over the succeeding 30 calendar days?
> Probably many people would settle for something like:
>   rollapply (log(lag(P))-log(P), 21, sd, align="left") * sqrt(252)
> (where P is the price series). However, this is an approximation.
> Not all periods of 30 calendar days include precisely 21 trading days.
>
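To make the size of that approximation concrete, here is a small sketch in base R that counts the weekdays in every 30-calendar-day window starting in 2011. It ignores exchange holidays (an assumption; a real trading calendar would lower some counts further), and the name `weekdays_in_window` is made up for illustration.

```r
# Count the Monday-Friday dates in the window [start, start + ndays - 1].
# Exchange holidays are ignored, so these are upper bounds on trading days.
weekdays_in_window <- function(start, ndays = 30) {
  d <- seq(start, by = "day", length.out = ndays)
  wday <- as.POSIXlt(d)$wday            # 0 = Sunday, ..., 6 = Saturday
  sum(wday >= 1 & wday <= 5)
}

starts <- seq(as.Date("2011-01-01"), as.Date("2011-12-31"), by = "day")
counts <- vapply(seq_along(starts),
                 function(i) weekdays_in_window(starts[i]),
                 integer(1))
table(counts)                           # how often each count occurs
```

A 30-day window is four full weeks plus two extra days, so the weekday count ranges from 20 to 22 even before holidays are removed.
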
> This seems like an obvious enough question that I would think that it
> has been asked (and answered) many times before, but I could not find
> a reference to the recommended solution.
>
> If no one has tackled this problem before, I might try to put together
> a small library of functions that are like roll* but which operate
> on fixed time windows. I am including an example of one such function
> below.
>
> Matthew Clegg
>
> ztw_sum <- function (X, delta, align="right", partial=FALSE) {
>  # Zoo Time Window Sum
>  #
>  # On input, X is a zoo-based numeric vector and delta is a time difference.
>  # Constructs a zoo-based numeric vector of partial sums from X. The values
>  # included in a partial sum are those whose associated timestamps are
>  # within delta of the corresponding element from X.
>  #
>  # If align="right", then result[i] is a sum of those elements
>  # X[j] such that
>  #     0 <= timestamp[i] - timestamp[j] <= delta,
>  # where timestamp[i] is the timestamp (index) associated with the
>  # i-th element of X. Conversely, if align="left", then result[i] is a
>  # sum of those elements X[j] such that
>  #     0 <= timestamp[j] - timestamp[i] <= delta.
>  #
>  # Parameters:
>  # X:        A zoo-based numeric vector with a time-based index type.
>  # delta:    An object of type difftime specifying the size of
>  #           the time window.
>  # align:    Specifies whether the sum for a given index should
>  #           be computed using elements of lower timestamps ("right")
>  #           or higher timestamps ("left").
>  # partial:  If TRUE, then partial sums are computed for elements
>  #           at the left (respectively, right) end of the vector.
>  #
>  # Returns a zoo-based numeric vector of partial sums.
>  #
>  # Running time is O(length(X)).
>
>  if (!inherits(X, "zoo") || !inherits(coredata(X), "numeric")) {
>    stop ("X must be a numeric vector of type zoo");
>  } else if (delta <= 0) {
>    stop ("delta must be positive");
>  } else if ((align != "left") && (align != "right")) {
>    stop ("align must be from c('left', 'right')");
>  }
>
>  timestamp <- index(X)
>  R <- zoo(NA, order.by = timestamp); # The result vector
>  sum <- 0;  # The current partial sum
>
>  if (align == "right") {
>    # Invariants:
>    #   (a) 0 < i <= j <= length(X)
>    #   (b) 0 <= timestamp(j) - timestamp(i) <= delta
>    i <- 1;  # The leftmost index in the current window
>    for (j in 1:length(X)) {
>      if (!is.na(X[j])) {
>        sum <- sum + as.numeric(X[j]);
>      }
>      while (timestamp[j] - timestamp[i] > delta) {
>        if (!is.na(X[i])) {
>          sum <- sum - as.numeric(X[i]);
>        }
>        i <- i+1;
>      }
>      if ((i > 1) || partial) {
>        R[j] <- sum;
>      }
>    }
>  } else { # align == "left"
>    # Invariants:
>    #   (a) 0 < j <= i <= length(X)
>    #   (b) 0 <= timestamp(i) - timestamp(j) <= delta
>    i <- length(X);  # The rightmost index in the current window
>    for (j in length(X):1) {
>      if (!is.na(X[j])) {
>        sum <- sum + as.numeric(X[j]);
>      }
>      while (timestamp[i] - timestamp[j] > delta) {
>        if (!is.na(X[i])) {
>          sum <- sum - as.numeric(X[i]);
>        }
>        i <- i-1;
>      }
>      if ((i < length(X)) || partial) {
>        R[j] <- sum;
>      }
>    }
>  }
>
>  R
> }
>
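For the align="right" case described above, the same sums can probably be had without an explicit loop by differencing a cumulative sum. The following is only a sketch under stated assumptions: `window_sum_right` is a made-up name, `delta` is taken to be a plain number in the same units as `as.numeric(index(X))` (days for a Date index), NAs are counted as zero as in the function above, and partial windows at the start are always returned (as with partial=TRUE).

```r
library(zoo)

# Hypothetical helper: result[i] is the sum of X[j] over all j with
#   0 <= timestamp[i] - timestamp[j] <= delta
window_sum_right <- function(X, delta) {
  tt <- as.numeric(index(X))     # zoo keeps the index sorted
  x <- coredata(X)
  x[is.na(x)] <- 0               # treat missing values as zero
  cs <- c(0, cumsum(x))          # cs[k + 1] = x[1] + ... + x[k]
  # For each i, count the timestamps strictly before the window start
  # tt[i] - delta; those are the elements to exclude from the sum.
  n_before <- findInterval(tt - delta, tt, left.open = TRUE)
  zoo(cs[seq_along(x) + 1] - cs[n_before + 1], index(X))
}

# Irregularly spaced dates: gaps of 1, 4 and 1 days.
z <- zoo(c(1, 2, 4, 8), as.Date("2011-11-01") + c(0, 1, 5, 6))
window_sum_right(z, 2)
```

Because the result is a difference of two running totals, floating-point error can accumulate on very long series, which the add-and-subtract loop above also does not avoid.
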
Here is a one-liner (two if you count making the result into a zoo object).
It sums, for each time t, the values whose times fall in (t - 3, t]:
> library(zoo)
> z <- zoo(1:25)
> zz <- sapply(seq_along(z), function(i) sum(z[time(z) <= time(z)[i] & time(z) > time(z)[i] - 3]))
> zoo(zz, time(z))
 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
 1  3  6  9 12 15 18 21 24 27 30 33 36 39 42 45 48 51 54 57 60 63 66 69 72
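
The same subsetting idea should carry over to a Date index and to functions other than sum, for example the forward-looking 30-calendar-day volatility asked about. This is a rough sketch, not a tested library function: `apply_forward_window` is a name invented here, the prices are simulated, and the returns are indexed at the later of the two dates.

```r
library(zoo)

# Hypothetical helper: for each time t of z, apply FUN to the values whose
# times fall in the forward-looking window [t, t + width).
apply_forward_window <- function(z, width, FUN) {
  tt <- time(z)
  x <- coredata(z)
  out <- sapply(seq_along(z), function(i) {
    FUN(x[tt >= tt[i] & tt < tt[i] + width])
  })
  zoo(out, tt)
}

# Simulated prices on a weekday-only index (made-up data).
set.seed(1)
dates <- seq(as.Date("2011-01-03"), as.Date("2011-06-30"), by = "day")
dates <- dates[!(as.POSIXlt(dates)$wday %in% c(0, 6))]
P <- zoo(100 * exp(cumsum(rnorm(length(dates), sd = 0.01))), dates)

r <- diff(log(P))                                   # daily log returns
vol30 <- apply_forward_window(r, 30, sd) * sqrt(252)
```

Like the one-liner, this rescans the whole index for every observation, so it is quadratic in the length of the series. Windows near the end of the series are shorter than 30 days, and the last one holds a single return, for which sd gives NA.
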
--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com