Parallel code using parLapply
On Fri, Dec 21, 2012 at 10:42 AM, Chris Hergarten <chegaga at yahoo.com> wrote:
Dear R-users
I ran into problems with my R code when trying to run conditioned Latin hypercube sampling (clhs package) in parallel, i.e. on several data sets simultaneously.
Here is the code (which I developed with some help:)):
******************************************
library("clhs")
library("snow")

a <- as.data.frame(replicate(1000, rnorm(20)))
b <- as.data.frame(replicate(1000, rnorm(20)))
c <- as.data.frame(replicate(1000, rnorm(20)))
d <- as.data.frame(replicate(1000, rnorm(20)))
abcd <- list(a, b, c, d)

cl <- makeCluster(4)
results <- parLapply(cl,
                     X = abcd,
                     FUN = function(i) {
                       clhs(x = i, size = round(nrow(i) / 5), iter = 2000, simple = FALSE)
                     },
)
stopCluster(cl)
******************************************
Before reaching the last line, R throws an error: "Error in length(x) : 'x' is missing". Any ideas what I am doing wrong and how to fix it?
Loading clhs on the primary does not automatically load it on the workers. Try:

clusterEvalQ(cl, library(clhs))

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
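
For readers following the thread, here is a minimal sketch of how the suggestion above might be combined with the original script. It rests on two assumptions that are not stated in the thread. First, the quoted error looks consistent with snow's parLapply naming its arguments in lower case (x and fun), so that X = and FUN = are not matched; the sketch sidesteps this by passing the arguments positionally. Second, the trailing comma after the function is dropped. The data frames are also made smaller than in the question (10 columns instead of 1000) purely to keep the example quick.

```r
library("clhs")
library("snow")

# Four small data frames standing in for the data sets in the question
# (fewer columns than the original, only to keep the run short).
abcd <- replicate(4, as.data.frame(replicate(10, rnorm(20))), simplify = FALSE)

cl <- makeCluster(4)

# Load clhs on every worker; attaching it in the primary session is not enough.
clusterEvalQ(cl, library(clhs))

# Pass the list and the function positionally, so the call does not depend
# on how parLapply names its arguments (assumed to be x and fun in snow).
results <- parLapply(cl, abcd, function(i) {
  clhs(x = i, size = round(nrow(i) / 5), iter = 2000, simple = FALSE)
})

stopCluster(cl)
```

If this runs as intended, results should be a list with one clhs result per data set, in the same order as abcd.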