
Excessive data needed for volatility{TTR} calculation?

7 messages · Joshua Ulrich, James

#
Hi,

I have been using the volatility function from the TTR package and I
noticed something that I thought was a bit unusual. I expected that I
should be able to calculate the default 10-day volatility using the
close estimator starting with 10 or maybe 11 days of data.  That's not
what I found.  It appears that 18 days of data is necessary to
calculate a 10-day volatility.  For example:
[1] "SPY"
Error in `[.xts`(x, beg:(n + beg - 1)) : subscript out of bounds
Error in `[.xts`(x, beg:(n + beg - 1)) : subscript out of bounds
[,1]
2011-05-03         NA
2011-05-04         NA
2011-05-05         NA
- edited for brevity -
2011-05-23         NA
2011-05-24         NA
2011-05-25         NA
2011-05-26 0.09481466

Stranger still (at least to me), it appears that 38 days' worth of data
are necessary to start calculating a 20-day volatility.
Error in `[.xts`(x, beg:(n + beg - 1)) : subscript out of bounds
[,1]
2011-04-04        NA
2011-04-05        NA
2011-04-06        NA
 - edited for brevity -
2011-05-23        NA
2011-05-24        NA
2011-05-25        NA
2011-05-26 0.1088309

Similarly, 58 days of data are necessary for a 30-day volatility
calculation.  Does anybody have an idea of why so much additional data
is necessary?  Thanks.

James

R version 2.13.0 (2011-04-13)
Copyright (C) 2011 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
#
Hi again,

I've been trying to figure out the problem and I believe there is a
problem with the vectorization in volatility, which results in the
volatility calculations for the close to close method being
inaccurate.  I believe the issue is with this part of line 14.

runSum((r - rBar)^2, n - 1)

The first 9 r all have to be differenced against the same rBar, not a
running sum of rBars.  I believe a better way to accomplish this would
be:

s <- sqrt(N) * runSD(r, (n - 1))

For reference, this is the relevant part of the current function:

function (OHLC, n = 10, calc = "close", N = 260, ...)
{
    OHLC <- try.xts(OHLC, error = as.matrix)
    calc <- match.arg(calc, c("close", "garman.klass", "parkinson",
        "rogers.satchell", "gk.yz", "yang.zhang"))
    if (calc == "close") {
        if (NCOL(OHLC) == 1) {
            r <- ROC(OHLC[, 1], 1, ...)
        }
        else {
            r <- ROC(OHLC[, 4], 1, ...)
        }
        rBar <- runSum(r, n - 1)/(n - 1)
        s <- sqrt(N/(n - 2) * runSum((r - rBar)^2, n - 1))       # line 14
    }
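
A rough sketch of why the current formula needs 2 * (n - 1) rows: the running mean is NA until row n, and the running sum of squared deviations then needs another n - 1 valid rows on top of that. The helpers roll_apply and first_valid below are hypothetical base-R stand-ins written for illustration (they are not the TTR functions), and the returns are made-up random numbers.

```r
# Hypothetical base-R stand-in for a TTR-style running function:
# returns NA until a full window of k non-NA values is available.
roll_apply <- function(x, k, f) {
  out <- rep(NA_real_, length(x))
  if (length(x) >= k) {
    for (i in k:length(x)) {
      w <- x[(i - k + 1):i]
      if (!any(is.na(w))) out[i] <- f(w)
    }
  }
  out
}

# Row of the first non-NA volatility for a window of n, for both formulas.
first_valid <- function(n, len = 80) {
  r <- c(NA, rnorm(len - 1) / 100)   # made-up returns; first is NA, as with ROC()
  rBar <- roll_apply(r, n - 1, sum) / (n - 1)
  old  <- roll_apply((r - rBar)^2, n - 1, sum)   # running mean inside a running sum
  new  <- roll_apply(r, n - 1, sd)               # one mean per window
  c(old = which(!is.na(old))[1], new = which(!is.na(new))[1])
}

res <- sapply(c(10, 20, 30), first_valid)
res["old", ]   # 18 38 58
res["new", ]   # 10 20 30
```

The "old" row reproduces the 18, 38 and 58 days reported above, while a plain rolling standard deviation of n - 1 returns is available from row n.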

Please let me know if this makes sense to anyone else, or if I'm
mistaken.  Thanks.

James
On Fri, May 27, 2011 at 6:52 PM, J Toll <jctoll at gmail.com> wrote:
#
Hi James,
On Fri, May 27, 2011 at 9:33 PM, J Toll <jctoll at gmail.com> wrote:
Thanks for digging into this.  I've recently received one or two
emails about this off-list, but have not had time to look into the
issue.

I think your solution will work, but using 'n' instead of 'n-1'.  The
code below shows the same results using your solution and a formula
similar to the one found here (which I mis-interpreted when I
originally wrote the function):
http://web.archive.org/web/20081224134043/http://www.sitmo.com/eq/172

library(TTR)   # runSD
library(xts)   # last
set.seed(21)
N <- 260       # periods per year used for annualizing
n <- 100
r <- rnorm(n)/100
last(sqrt(N) * runSD(r, n))
sqrt(N/(n-1)*sum((r-mean(r))^2))

Thanks!
--
Joshua Ulrich  |  FOSS Trading: www.fosstrading.com
#
On Fri, May 27, 2011 at 10:39 PM, Joshua Ulrich <josh.m.ulrich at gmail.com> wrote:
Hi Joshua,

Thanks for replying and confirming my suspicions. However, I'm curious
why you would use 'n' rather than 'n-1'.  My thinking is that a 10-day
volatility (n = 10) is calculated as the annualized standard deviation
of 9 (n - 1) price returns (i.e. ln(p1/p0), ROC()).  The sample
standard deviation of 9 price returns would be the sum of the squared
deviations divided by 9 - 1, or n - 2.  Therefore, I believe your line

sqrt(N / (n - 1) * sum((r - mean(r)) ^ 2))

should actually be

sqrt(N / (n - 2) * sum((r - mean(r)) ^ 2))

I've been double-checking my work and went ahead and calculated 10 and
20-day vols by hand and I'm pretty sure

s <- sqrt(N) * runSD(r, (n - 1))

is correct, unless you're defining 10-day volatility as 11 days of data
and 10 price returns.  Please let me know otherwise. Thanks.
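
As a small check of the counting argument above, here is a base-R sketch with made-up prices (the price series and seed are assumptions for illustration only): 10 prices give 9 log returns, and the sample standard deviation of those 9 returns divides by 8, i.e. n - 2.

```r
set.seed(42)
N <- 260                             # annualization factor, as in volatility()
n <- 10                              # "10-day" volatility
p <- cumprod(1 + rnorm(n) / 100)     # 10 made-up closing prices
r <- diff(log(p))                    # n - 1 = 9 log returns

v_sd     <- sqrt(N) * sd(r)                              # sd() divides by length(r) - 1
v_manual <- sqrt(N / (n - 2) * sum((r - mean(r))^2))     # explicit n - 2 denominator

length(r)                            # 9
isTRUE(all.equal(v_sd, v_manual))    # TRUE
```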

James
#
Hi James,
On Fri, May 27, 2011 at 11:25 PM, J Toll <jctoll at gmail.com> wrote:
Actually, because the first return in the moving window would always
be NA, it should be:

sqrt(N/(n-2)*sum((r[-1]-mean(r[-1]))^2))

which yields the same result as:

last(sqrt(N) * runSD(r, n-1))

After getting some sleep, it's clear that your initial solution (n-1)
is correct.

Your patch will be on R-forge shortly.  Many thanks again!

Best,
--
Joshua Ulrich  |  FOSS Trading: www.fosstrading.com
#
Joshua,
On Sat, May 28, 2011 at 7:13 AM, Joshua Ulrich <josh.m.ulrich at gmail.com> wrote:
I've been trying both lines of code and unfortunately I'm not getting
the same results.  The first line seems to only work properly for me
in those instances when NROW(OHLC) == n.  For the more common situation
where NROW(OHLC) > n, you would want a rolling window of vol
calculations.  I'm still thinking that the code should be:

s <- sqrt(N) * runSD(r, (n - 1))

As a frame of reference, I believe the output should be:
[,1]
2011-05-20 0.1206382
2011-05-23 0.1181380
2011-05-24 0.1095445
2011-05-25 0.1069024
2011-05-26 0.1068434
2011-05-27 0.1038008

I've manually calculated the value for 2011-05-27 using a spreadsheet
to confirm the value. I believe the other values to be correct also.
You may want to hold off on a patch in the short term.  I still think
there might be an error in there.  I'm sorry to be such a nuisance
about this, but thanks so much for your help.

James
#
Hi James,
On Sat, May 28, 2011 at 9:44 AM, J Toll <jctoll at gmail.com> wrote:
<snip>
My last email wasn't very clear; I apologize.

I still agree with your suggestion and plan to use it as a patch.  The
first line in my prior email was to illustrate (and convince myself)
that your solution matched the formula here:
http://web.archive.org/web/20081224134043/http://www.sitmo.com/eq/172

And it only matches when NROW(OHLC) == n because your solution
operates on a rolling window and my first line operates on everything.
 Try something like this:

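library(TTR)   # runSD, ROC
library(zoo)   # rollapply
# assumes r and N from the example in my earlier email
n <- 5
R <- cumprod(1+r)                 # price series built from the returns
FUN <- function(x) {
  r <- ROC(x); n <- NROW(x)       # n prices give n-1 returns (first is NA)
  sqrt(N/(n-2)*sum((r-mean(r, na.rm=TRUE))^2, na.rm=TRUE))
}
head(sqrt(N) * runSD(ROC(R), n-1),15)
head(rollapply(R, n, FUN, align="right", fill=NA),15)
n <- 10
head(sqrt(N) * runSD(ROC(R), n-1),15)
head(rollapply(R, n, FUN, align="right", fill=NA),15)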
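
The same point can be sketched in base R only, without TTR or zoo. The helper vol and the random prices are assumptions for illustration: a formula applied to the whole series agrees with the rolling calculation only when it is restricted to the last n prices.

```r
set.seed(21)
N <- 260
n <- 10
p <- cumprod(1 + rnorm(30) / 100)    # 30 made-up prices, more than n

# Hypothetical helper: annualized sample SD of the log returns of p
vol <- function(p) sqrt(N) * sd(diff(log(p)))

whole   <- vol(p)                    # uses all 29 returns
window  <- vol(tail(p, n))           # last n prices, n - 1 returns
rolling <- sapply(n:length(p), function(i) vol(p[(i - n + 1):i]))

isTRUE(all.equal(tail(rolling, 1), window))   # TRUE
c(whole = whole, window = window)             # differ when length(p) > n
```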

Sorry for the confusion.

Best,
--
Joshua Ulrich  |  FOSS Trading: www.fosstrading.com