Problems parallelizing glmnet
4 messages · Patrik Waldmann, Zachary Mayer, Simon Urbanek
Hasn't the caret package already solved this problem? You can pass the tuneGrid parameter to specify your custom alpha and lambda sequence, and a trainControl object (via the trControl parameter) to specify what kind of cross-validation you wish to use. Caret uses foreach, so you can register a parallel backend of your choice.
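A minimal sketch of the caret route described above, not code from the thread. The data dimensions, the grid values, the two-worker cluster, and the choice of doParallel as the foreach backend are all assumptions made for illustration; any registered foreach backend should work the same way.

```r
library(caret)
library(doParallel)  # assumption: any foreach backend could be registered instead

set.seed(1)
# Small simulated data so the sketch runs quickly (the post used 2000 x 10000)
x <- matrix(rnorm(200 * 50), ncol = 50)
colnames(x) <- paste0("V", seq_len(ncol(x)))
y <- rnorm(200)

cl <- makeCluster(2)   # hypothetical worker count
registerDoParallel(cl)

# Custom alpha and lambda values (illustrative only)
grid <- expand.grid(alpha = seq(0, 1, by = 0.25),
                    lambda = 10^seq(-3, 0, length.out = 20))

# 10-fold cross-validation; resamples are dispatched through foreach
ctrl <- trainControl(method = "cv", number = 10)

fit <- train(x = x, y = y, method = "glmnet",
             tuneGrid = grid, trControl = ctrl)

stopCluster(cl)
fit$bestTune   # alpha/lambda pair with the lowest resampled RMSE
```

Because y is pure noise here, caret may warn about missing values in the resampled R-squared; that is expected for this toy data and does not affect the mechanics.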
On Sep 6, 2012, at 11:56 AM, Patrik Waldmann <patrik.waldmann at boku.ac.at> wrote:
I want to run the cv.glmnet function on the same data (y and x) with different values of the alpha parameter, the number of values being determined by the number of cores, but the result is absurd. What is wrong with the code below?
Patrik Waldmann
x <- matrix(rnorm(2000 * 10000), ncol = 10000)
y <- matrix(rnorm(2000), ncol = 1)
library(parallel)
cvglmnet <- function(...) {
  library(glmnet)
  cv.glmnet(x, y, alpha = alphasplit)
}
system.time(cores <- detectCores())
system.time(cl <- makeCluster(cores, methods = FALSE))
alpha <- seq(0, 1, by = 1 / (cores - 1))
alphasplit <- clusterSplit(cl, alpha)
system.time(clusterExport(cl, c("x", "y", "cvglmnet", "alphasplit")))
system.time(outbrlist <- clusterEvalQ(cl, cvglmnet(x, y, alphasplit)))
system.time(totoutbr <- do.call(cbind, outbrlist))
stopCluster(cl)
_______________________________________________ R-sig-hpc mailing list R-sig-hpc at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-hpc
On Sep 6, 2012, at 11:50 AM, Patrik Waldmann <patrik.waldmann at boku.ac.at> wrote:
I want to run the cv.glmnet function on the same data (y and x) with different values of the alpha parameter, the number of values being determined by the number of cores, but the result is absurd. What is wrong with the code below?
You're evaluating exactly the same expression on all nodes ... I don't think you intended that (you are passing the alphasplit list as alpha to all of them - I don't think that makes sense). Isn't this closer to the intention:

    alphas <- seq(0, 1, length.out = cores)
    out <- clusterApply(cl, alphas, function(alpha) cv.glmnet(x, y, alpha = alpha))

Cheers,
Simon
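For readers who want to run the suggestion end to end, here is a hedged, self-contained sketch built around the clusterApply call. The reduced data size, the fixed two-worker cluster, the shared foldid vector, and the cv_min summary are assumptions added for illustration and are not part of the reply. The shared foldid follows the glmnet documentation's advice to reuse the same folds when comparing values of alpha.

```r
library(parallel)

set.seed(1)
# Smaller simulated data than in the original post, so the sketch runs quickly
x <- matrix(rnorm(200 * 50), ncol = 50)
y <- rnorm(200)

cores <- 2                 # hypothetical; the post used detectCores()
cl <- makeCluster(cores)

# Same fold assignment for every alpha, so the CV errors are comparable
foldid <- sample(rep(1:10, length.out = nrow(x)))

# Attach glmnet on every worker and ship the shared objects to them
invisible(clusterEvalQ(cl, library(glmnet)))
clusterExport(cl, c("x", "y", "foldid"))

# Each element of alphas is sent to a worker as the function's argument
alphas <- seq(0, 1, length.out = cores)
out <- clusterApply(cl, alphas, function(alpha) {
  cv.glmnet(x, y, alpha = alpha, foldid = foldid)
})

stopCluster(cl)

# Minimum mean cross-validated error for each alpha
cv_min <- sapply(out, function(fit) min(fit$cvm))
data.frame(alpha = alphas, cv_min = cv_min)
```

Each element of out is a complete cv.glmnet object, so per-alpha summaries are extracted element by element here rather than combined with cbind.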