Message-ID: <ADB7A283-C4EA-48F4-B1F2-AB1EF64C56BF@r-project.org>
Date: 2012-09-06T17:41:06Z
From: Simon Urbanek
Subject: Problems parallelizing glmnet
In-Reply-To: <5048E265020000F90000AFE3@gwia1.boku.ac.at>

On Sep 6, 2012, at 11:50 AM, Patrik Waldmann <patrik.waldmann at boku.ac.at> wrote:

> I want to run the cv.glmnet function on the same data (y and x) with different values of the alpha parameter, the number of values being determined by the number of cores, but the result is absurd. What is wrong with the code below?
> 

You're evaluating exactly the same expression on all nodes, which I don't think you intended: clusterEvalQ runs cvglmnet() identically on every node, so each one gets the whole alphasplit list as alpha, and that doesn't make sense. Isn't this closer to the intention:

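# One alpha per core; each worker gets a single value rather than the whole list.
alphas <- seq(0, 1, length.out = cores)
# Assumes x and y were exported with clusterExport(); glmnet:: avoids needing library(glmnet) on each worker.
out <- clusterApply(cl, alphas, function(alpha) glmnet::cv.glmnet(x, y, alpha = alpha))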
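
To see the difference between the two calls without fitting anything, here is a minimal sketch using only the parallel package. fit_stub is a hypothetical stand-in for cv.glmnet that just reports which alpha value(s) it was handed, and the two-worker cluster is an assumption made to keep the example small:

```r
library(parallel)

# Two local workers keep the sketch small; the original used detectCores().
cl <- makeCluster(2)

alphas <- seq(0, 1, length.out = 2)

# Hypothetical stand-in for cv.glmnet(x, y, alpha = alpha): it only
# reports the alpha value(s) it received, as a single string.
fit_stub <- function(alpha) paste(alpha, collapse = ",")

clusterExport(cl, c("alphas", "fit_stub"))

# clusterEvalQ: the same expression is evaluated on every node, so every
# node sees the whole alphas vector.
same <- clusterEvalQ(cl, fit_stub(alphas))

# clusterApply: each element of alphas is passed to a separate call.
per_alpha <- clusterApply(cl, alphas, fit_stub)

stopCluster(cl)
```

Here every element of same describes the full vector, while per_alpha holds one result per alpha value, which is the behaviour the question was after.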

Cheers,
Simon



> Patrik Waldmann
> 
> x <- matrix(rnorm(2000*10000),ncol=10000)
> y <- matrix(rnorm(2000),ncol=1)
> 
> library(parallel)
> cvglmnet <- function(...) {
> library(glmnet)
> cv.glmnet(x,y,alpha=alphasplit)
> }
> system.time(cores<-detectCores())
> system.time(cl <- makeCluster(cores, methods=FALSE))
> alpha<-seq(0, 1,by=1/(cores-1))
> alphasplit<-clusterSplit(cl,alpha)
> system.time(clusterExport(cl, c("x","y","cvglmnet","alphasplit")))
> system.time(outbrlist<-clusterEvalQ(cl, cvglmnet(x,y,alphasplit)))
> system.time(totoutbr<-do.call(cbind,outbrlist))
> stopCluster(cl)
> 
> _______________________________________________
> R-sig-hpc mailing list
> R-sig-hpc at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-hpc