Incorrect SVD Calculation

6 messages · Patrick Burns, Pierre Lapointe, Bob +1 more

#
R-Financiers,
For any of you who, like us, use SVD for risk modeling and stat arb trading, we've discovered what I think is a very serious bug in R/LAPACK's SVD calculation. Since this bug has the potential to create bogus risk models, I thought this audience might be interested in this post.

Specifically, for many matrices that are not full rank and have a few small eigenvalues (much like covariance matrices from stock returns!), SVD may return completely bogus and impossible results: for a correlation matrix the sum of the singular values will be many times too large, and the 'u' and 'v' matrices will not be even close to orthonormal. For random matrices with certain distributions of eigenvalues (first discovered using a covariance matrix from 1-minute bar stock returns), SVD produces bogus results 20%-50% of the time.
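To make the two symptoms concrete, here is a minimal sketch of the invariants a correct result should satisfy. The simulated `returns` matrix is purely hypothetical data (not the matrix from the bug report), chosen only so that the correlation matrix is rank deficient. For a correlation matrix, which is symmetric and positive semidefinite, the singular values equal the eigenvalues, so their sum should equal the trace, i.e. the number of variables.

```r
set.seed(42)

# Hypothetical data: 50 observations of 100 series, so the 100 x 100
# correlation matrix has rank at most 49 (far from full rank).
returns <- matrix(rnorm(50 * 100), nrow = 50, ncol = 100)
C <- cor(returns)

s <- base::svd(C)

# Invariant 1: sum of singular values equals trace(C), which is 100 here
# because every diagonal entry of a correlation matrix is 1.
sum_d   <- sum(s$d)
trace_C <- sum(diag(C))

# Invariant 2: 'u' and 'v' are orthonormal, so t(u) %*% u and t(v) %*% v
# should both be the identity. These are the largest absolute deviations.
u_err <- max(abs(crossprod(s$u) - diag(ncol(s$u))))
v_err <- max(abs(crossprod(s$v) - diag(ncol(s$v))))
```

On a healthy installation `sum_d - trace_C`, `u_err` and `v_err` should all be within rounding error of zero; a result showing the problem described above would miss by a wide margin.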

I've posted the bug report with an example here:
https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=14962

For any of you using risk models or stat arb that use SVD, you may want to see if this error affects you.

Also, (selfishly) I wanted to post this to fellow practitioners in case any of you already knew about this problem and were using a better algorithm (I noticed there is more than one LAPACK SVD algorithm). The slow method of SVD calculation by eigen(crossprod(x)) and eigen(tcrossprod(x)) still works fine (but it's very slow).
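For reference, a rough sketch of what the eigen-based route can look like. The function name `svd_via_eigen` and the cutoff are assumptions for illustration, not anything from base R. This variant uses only eigen(crossprod(x)) and recovers 'u' from 'v', which avoids having to match signs between two separate eigendecompositions. It returns only the components whose singular values are numerically nonzero. Note that forming crossprod(x) squares the condition number, so small singular values are computed less accurately than a working svd() would manage.

```r
# Sketch only: SVD of x from the eigendecomposition of t(x) %*% x.
# 'svd_via_eigen' is a hypothetical helper name, not part of base R.
svd_via_eigen <- function(x) {
    x <- as.matrix(x)
    e <- eigen(crossprod(x), symmetric = TRUE)

    # Eigenvalues of t(x) %*% x are the squared singular values of x.
    # Clamp tiny negative values caused by rounding error before the sqrt.
    d <- sqrt(pmax(e$values, 0))

    # Assumed cutoff: singular values below sqrt(machine epsilon) times the
    # largest one cannot be resolved after squaring, so they are dropped.
    keep <- d > sqrt(.Machine$double.eps) * max(d)
    d <- d[keep]
    v <- e$vectors[, keep, drop = FALSE]

    # Since x %*% v = u %*% diag(d), divide column j of x %*% v by d[j].
    u <- (x %*% v) / rep(d, each = nrow(x))

    list(d = d, u = u, v = v)
}

# Usage on a small random matrix (hypothetical data).
set.seed(1)
x <- matrix(rnorm(20), nrow = 5, ncol = 4)
s <- svd_via_eigen(x)
recon_err <- max(abs(s$u %*% (s$d * t(s$v)) - x))
```

Because rank-deficient directions are dropped, 'u' and 'v' here may have fewer columns than base::svd would return, but u %*% diag(d) %*% t(v) should still reproduce x up to rounding.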

Last, here's a simple 'svd' replacement function that performs a rudimentary check on the output of svd. It fails if 'u' is not orthonormal. Generally 'u', 'd', and 'v' all seem to give wrong answers at the same time, but no guarantees that problems with 'v' and 'd' won't slip through.

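# Wrapper that masks base::svd and rejects results whose 'u' is not orthonormal.
svd <- function(...) {
    x <- base::svd(...)
    # Each column of 'u' should have unit length (check is skipped for complex input).
    if (!is.complex(x$u) && any(abs(colSums(x$u^2)-1) > 1e-3))
        stop("svd gave incorrect result")
    x
}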
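If problems with 'v' and 'd' slipping through are a concern, one possible stricter variant is sketched below. The name `checked_svd` and the default tolerance are assumptions, and it only handles the default call (no 'nu' or 'nv' arguments). It checks that both 'u' and 'v' are orthonormal and that the three factors multiply back to the input, which also exercises 'd'.

```r
# Sketch only: a stricter check than the wrapper above.
# 'checked_svd' and 'tol' are hypothetical, chosen for illustration.
checked_svd <- function(x, tol = 1e-8) {
    x <- as.matrix(x)
    s <- base::svd(x)

    # Largest deviation of Conj(t(m)) %*% m from the identity.
    # Conj() is a no-op for real matrices and handles complex ones.
    ortho_err <- function(m) {
        max(abs(Conj(t(m)) %*% m - diag(ncol(m))))
    }

    # Reconstruction u %*% diag(d) %*% Conj(t(v)); multiplying by s$d
    # scales row i of Conj(t(v)) by d[i].
    recon <- s$u %*% (s$d * Conj(t(s$v)))
    recon_err <- max(abs(recon - x)) / max(1, max(abs(x)))

    # isTRUE() makes NA/NaN results count as failures too.
    ok <- isTRUE(ortho_err(s$u) <= tol) &&
          isTRUE(ortho_err(s$v) <= tol) &&
          isTRUE(recon_err <= tol)
    if (!ok)
        stop("svd gave incorrect result")
    s
}

# Usage on a deliberately rank-deficient matrix (hypothetical data):
# the last column duplicates the first.
set.seed(7)
base_cols <- matrix(rnorm(30), nrow = 10, ncol = 3)
x <- cbind(base_cols, base_cols[, 1])
s <- checked_svd(x)
```

The reconstruction test costs an extra matrix product, so this is slower than checking only the column norms of 'u'.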

Suggestions, comments appreciated.
--Robert

Robert McGehee, CFA
Geode Capital Management, LLC
One Post Office Square, 28th Floor | Boston, MA | 02109
Direct: (617)392-8396

#
Thanks for highlighting that there is risk where
I hadn't expected any.

I've seen a different weird phenomenon in which
R's 'svd' throws an error because of non-convergence
in the algorithm.  Changing the order of the rows
makes it go away.

Pat
On 28/06/2012 16:52, Robert Harlow wrote: