Hi Mike!

I don't have experience with neural networks, but I do with other machine learning techniques. Whatever your approach turns out to be, there are some important things to know when using machine learning in a time series context.

1) Check the model's potential for generalization by calculating out-of-sample error measures. A test sample that follows the training sample in time is usually a harder test than cross-validation. That's general machine learning business.

2) Fit your models on the returns (differences) of the series rather than on the series itself. The latter is usually integrated, has increasing error variance and is therefore not iid. In my experience, failing to difference the series usually results in an extremely good in-sample fit and nonexistent out-of-sample generalization. The ARIMA literature covers this topic and the associated tests extensively.

3) Assume that the data-generating process (and therefore the model error) is not stable over time. You can check this by doing rolling analyses of the correlations between criterion and predictors, or by rolling model application and error assessment. (Have a look at the "rollapply" function for doing this.) A more formal check is to draw bootstrap samples of your focus quantities for two time intervals and test for differences - rolling may be overly time-consuming here.

Hope this helps!

Regards,
Gero
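A minimal sketch of points 1) and 2) above: difference an integrated series, then evaluate on a test sample that follows the training sample in time. Python is used here only for illustration; the random-walk data, sizes, and seed are invented, not from the original question.

```python
import numpy as np

rng = np.random.default_rng(0)
prices = np.cumsum(rng.normal(size=500))   # integrated series (random walk)
returns = np.diff(prices)                  # differencing -> ~stationary returns

# Out-of-sample check: the test sample FOLLOWS the training sample in time,
# which is a harder, more honest test than shuffled cross-validation.
split = int(0.8 * len(returns))
train, test = returns[:split], returns[split:]
```

Fitting on `prices` directly would reward the model for tracking the trend in-sample while telling you nothing about generalization; fitting on `returns` avoids that.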
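The rolling-correlation check from point 3) can be sketched as follows; in R this is the kind of analysis zoo's "rollapply" performs. The window length, variable names, and the toy data-generating process (a relation that decays over time) are my own invention for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, window = 400, 60
predictor = rng.normal(size=n)
slope = np.linspace(1.0, 0.0, n)           # the true relation decays over time
criterion = slope * predictor + rng.normal(scale=0.5, size=n)

# Correlation between criterion and predictor within each rolling window.
rolling_corr = np.array([
    np.corrcoef(predictor[i:i + window], criterion[i:i + window])[0, 1]
    for i in range(n - window + 1)
])
# A drifting rolling_corr is evidence that the data-generating process
# (and hence the model error) is not stable over time.
```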
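The more formal bootstrap comparison mentioned in point 3) could look like this: resample a focus quantity (here, absolute model errors) in two time intervals and compare. The sample sizes, error scales, and the 95% level are illustrative assumptions, not part of the original advice.

```python
import numpy as np

rng = np.random.default_rng(2)
errors_early = np.abs(rng.normal(scale=1.0, size=200))  # errors, interval 1
errors_late = np.abs(rng.normal(scale=2.0, size=200))   # errors, interval 2

def boot_means(x, n_boot=2000):
    """Bootstrap distribution of the mean of x (resampling with replacement)."""
    idx = rng.integers(0, len(x), size=(n_boot, len(x)))
    return x[idx].mean(axis=1)

diffs = boot_means(errors_late) - boot_means(errors_early)
lo, hi = np.percentile(diffs, [2.5, 97.5])
# If the interval (lo, hi) excludes zero, the error level differs between
# the two intervals, i.e. the process is not stable over time.
```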
Neural Networks and R
Gero Schwenk