Hi Mike!

I don't have experience with neural networks, but I do with other machine learning techniques. Whatever your approach turns out to be, there are some important things to know when using machine learning in a time series context.

1) Check the model's potential for generalization by calculating out-of-sample error measures. A test sample that follows the training sample in time is usually a harder test than cross-validation. That's general machine learning business.

2) Fit your models on the returns (differences) of the series rather than on the series itself. The latter is usually integrated, has increasing error variance and is therefore not iid. In my experience, failing to difference the series usually results in an extremely good in-sample fit and nonexistent out-of-sample generalization. The ARIMA literature covers this topic and the associated tests extensively.

3) Assume that the data-generating process (and therefore the model error) is not stable over time. You can check this by doing rolling analyses of the correlations between criterion and predictors, or by rolling model application and error assessment. (Have a look at the "rollapply" function for doing this.) A more formal check is to draw bootstrap samples of your focus quantities for two time intervals and test for differences - rolling may be overly time-consuming here.

Hope this helps!

Regards,
Gero
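A minimal sketch of points 1) and 2) above: difference an integrated series, then evaluate on a test sample that follows the training sample in time. Python is used here only for illustration; the random-walk data, sizes, and seed are invented, not from the original question.

```python
import numpy as np

rng = np.random.default_rng(0)
prices = np.cumsum(rng.normal(size=500))   # integrated series (random walk)
returns = np.diff(prices)                  # differencing -> ~stationary returns

# Out-of-sample check: the test sample FOLLOWS the training sample in time,
# which is a harder, more honest test than shuffled cross-validation.
split = int(0.8 * len(returns))
train, test = returns[:split], returns[split:]
```

Fitting on `prices` directly would reward the model for tracking the trend in-sample while telling you nothing about generalization; fitting on `returns` avoids that.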
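The rolling-correlation check from point 3) can be sketched as follows; in R this is the kind of analysis zoo's "rollapply" performs. The window length, variable names, and the toy data-generating process (a relation that decays over time) are my own invention for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, window = 400, 60
predictor = rng.normal(size=n)
slope = np.linspace(1.0, 0.0, n)           # the true relation decays over time
criterion = slope * predictor + rng.normal(scale=0.5, size=n)

# Correlation between criterion and predictor within each rolling window.
rolling_corr = np.array([
    np.corrcoef(predictor[i:i + window], criterion[i:i + window])[0, 1]
    for i in range(n - window + 1)
])
# A drifting rolling_corr is evidence that the data-generating process
# (and hence the model error) is not stable over time.
```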
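The more formal bootstrap comparison mentioned in point 3) could look like this: resample a focus quantity (here, absolute model errors) in two time intervals and compare. The sample sizes, error scales, and the 95% level are illustrative assumptions, not part of the original advice.

```python
import numpy as np

rng = np.random.default_rng(2)
errors_early = np.abs(rng.normal(scale=1.0, size=200))  # errors, interval 1
errors_late = np.abs(rng.normal(scale=2.0, size=200))   # errors, interval 2

def boot_means(x, n_boot=2000):
    """Bootstrap distribution of the mean of x (resampling with replacement)."""
    idx = rng.integers(0, len(x), size=(n_boot, len(x)))
    return x[idx].mean(axis=1)

diffs = boot_means(errors_late) - boot_means(errors_early)
lo, hi = np.percentile(diffs, [2.5, 97.5])
# If the interval (lo, hi) excludes zero, the error level differs between
# the two intervals, i.e. the process is not stable over time.
```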
Neural Networks and R
Gero Schwenk