Problems parallelizing glmnet
I can tell you from my testing with caret that there is considerable speedup using foreach. See Figure 3 of http://cran.r-project.org/web/packages/caret/vignettes/caretTrain.pdf. Of course it is model dependent, but I have yet to see it slow down computations (though I'm sure it is possible).

Max

On Thu, Sep 6, 2012 at 5:09 PM, Peter Langfelder
<peter.langfelder at gmail.com> wrote:
On Thu, Sep 6, 2012 at 1:58 PM, Zachary Mayer <zach.mayer at gmail.com> wrote:
In this case, each iteration of the function is very quick:
system.time(summary(lm(y ~ x[,1]))$coefficients[2,4])
   user  system elapsed
   0.01    0.00    0.02

And you are doing 10,000 iterations, so overhead matters a lot. In the glmnet problem, each iteration of the function is very slow, and you are doing 8 iterations, so overhead doesn't matter at all. Finally, I suspect that using the doMC foreach backend will improve things considerably, but I can't currently test that.
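A minimal sketch of the "few, slow iterations" regime described above, using the doParallel backend rather than doMC so it also runs on Windows (an assumed example, not code from the thread; the sleep stands in for one slow glmnet fit):

```r
library(foreach)
library(doParallel)

# Register a small parallel backend for illustration.
cl <- makeCluster(2)
registerDoParallel(cl)

# Few, slow iterations: the fixed per-task overhead of foreach is
# negligible relative to the work, so parallelism pays off.
slow <- foreach(i = 1:4, .combine = c) %dopar% {
  Sys.sleep(0.2)   # stand-in for one slow model fit
  i^2
}

slow   # 1 4 9 16

stopCluster(cl)
```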
FWIW, the foreach construct itself (without any parallel backend) is quite slow, and I would not use it to loop over a large number of quick calculations.

Peter
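The overhead Peter describes can be seen by timing a sequential foreach (`%do%`, no backend) against a plain for loop on many trivial iterations (an assumed illustration; exact timings vary by machine, but the for loop is typically far faster because foreach's per-iteration bookkeeping dominates the trivial work):

```r
library(foreach)

n <- 10000

# Plain for loop over n trivial computations.
t_for <- system.time({
  out1 <- numeric(n)
  for (i in seq_len(n)) out1[i] <- i + 1
})[["elapsed"]]

# Same work via sequential foreach (%do%), which pays a fixed
# per-iteration cost for argument handling and result combining.
t_foreach <- system.time({
  out2 <- foreach(i = seq_len(n), .combine = c) %do% (i + 1)
})[["elapsed"]]

c(for_loop = t_for, foreach = t_foreach)
```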
_______________________________________________
R-sig-hpc mailing list
R-sig-hpc at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-hpc
Max