smart updates and rolling windows
Brad and list,

I have implemented quite a number of statistical methods in a 'moving' way. Most of these implementations rely on common-sense transformations and a few strategic approximations. What I would really like, and I am sure Brad would second this, is a good reference for these sorts of calculations. For example, the updating of matrix factorizations is fairly well researched and documented, but it is not always evident when such approaches can be applied in a particular situation. Is there a good collection of real-time algorithms as applied to higher-level problems? Any suggested references would be appreciated.

By way of payment, I offer this small calculation that I find useful to have in my R (S-Plus, actually) toolbox. It generalizes a number of moving operations, and often provides sufficient speedup over a raw for-loop to remove the need for further optimization. I won't give the implementation, just the idea; the implementation is a few lines of C.

The function is a moving matrix multiply. The signature is:

    mov.mat.mul <- function(x, A)

and the operation is:

    Y <- mov.mat.mul(x, A)

If x is a length-n vector and A is an m x w matrix, then Y is an n x m matrix where:

    Y[i, ] = A %*% x[(i-w+1):i]

With the right choice of A one can do moving regressions of all sorts. Also, doing:

    apply(mov.mat.mul(x, diag(win)), 1, my.non.linear.f)

is a nice way to compute arbitrary moving functions and get good enough performance to assess whether further improvement is warranted.

To summarize: I would like to hear about good references for these sorts of calculations. My own small contribution is to suggest that moving matrix multiplication is a fundamental operation that is useful for moving computations.

Matt
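For concreteness, here is a minimal plain-R sketch of the mov.mat.mul semantics Matt describes. The real version would be a few lines of C; this reference version is for checking results, not speed, and the window-filling convention (leading rows left as NA) is an assumption, since the original does not say what happens for i < w.

```r
# Reference implementation (assumed semantics): Y[i, ] = A %*% x[(i-w+1):i],
# with the first w-1 rows left as NA because their windows are incomplete.
mov.mat.mul <- function(x, A) {
  n <- length(x)
  m <- nrow(A)
  w <- ncol(A)
  Y <- matrix(NA_real_, nrow = n, ncol = m)
  for (i in w:n) {
    Y[i, ] <- A %*% x[(i - w + 1):i]
  }
  Y
}

# Moving mean of width 3 as a special case: A is a 1 x w matrix of 1/w.
x <- c(1, 2, 3, 4, 5)
Y <- mov.mat.mul(x, matrix(1/3, nrow = 1, ncol = 3))
# Y[3:5, 1] holds the moving means; rows 1:2 stay NA.
```

Choosing A = diag(w) makes each row of Y the raw window itself, which is what makes the apply() trick above work for arbitrary moving functions.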
On Sat, 2007-09-29 at 19:14 -0700, Bradford Cross wrote:
Greetings, R'ers!

I have been looking for mathematics libraries for event stream processing / time series simulation. Mathematics libraries for event stream processing require two key features:

1) "smart updates" -- functions use optimal update algorithms; for example, once the mean is calculated for an event stream, subsequent calls to the function are computed from the previous value of the mean rather than by brute-force recalculation;

2) "rolling calculations" -- functions take a lag parameter for sample size; for example, the mean of the last 100 events.

I found a couple of simple summary statistics implemented like this in the zoo package. I have also found implementations of smart updates in some other languages (Apache Commons Math, and Boost Accumulators), but these only support accumulated calculations, not rolling calculations.

I have built libraries for this before, and I am currently working on a new version -- but before I reinvent the wheel I am trying to find some folks in the community with similar interests to collaborate with. My personal use for this is financial time series analysis, so I am interested in implementing these high-performance algorithms for classical statistics, robust statistics, regression models, etc.

Best!
/brad
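As a minimal sketch of the two features Brad describes, here is the mean done both ways in plain R. The function names (update.mean, roll.mean) are illustrative only and do not come from zoo or any other package.

```r
# Smart update: fold one new observation x into a running mean of n points,
# using only the previous mean -- no re-summing of the stream.
update.mean <- function(mean.prev, n, x) {
  mean.prev + (x - mean.prev) / (n + 1)
}

# Rolling update: when a window of size `lag` slides by one event, add the
# entering value and drop the leaving one; O(1) per event instead of O(lag).
roll.mean <- function(mean.prev, lag, x.in, x.out) {
  mean.prev + (x.in - x.out) / lag
}

# Accumulated mean of the stream 1, 2, 3, 4, built one event at a time.
m <- 0
for (i in 1:4) m <- update.mean(m, i - 1, i)

# Rolling mean: window c(1, 2, 3) has mean 2; slide to c(2, 3, 4).
r <- roll.mean(2, 3, x.in = 4, x.out = 1)
```

The same add-the-entering-term, subtract-the-leaving-term pattern extends to variances and regression sufficient statistics, though for long-running streams one would want the numerically safer update forms.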
_______________________________________________
R-SIG-Finance at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-finance
-- Subscriber-posting only.
-- If you want to post, subscribe first.