more efficient way to parallel

On 08/06/2012 09:41 AM, Jie wrote:
Rewrite your outer loop as an lapply, then on non-Windows systems use 
parallel::mclapply; on Windows, use makePSOCKcluster and parLapply. I 
ended up with

library(parallel)
library(MASS)
Maxi <- 10
Maxj <- 1000

doit <- function(i, Maxi, Maxj)
{
     ## initialization, not of interest
     Sigmahalf <- matrix(sample(10000, replace=TRUE),  100)
     Sigma <- t(Sigmahalf) %*% Sigmahalf
     x <- mvrnorm(n=Maxj, rep(0, 100), Sigma)
     xlist <- lapply(seq_len(nrow(x)), function(i, x) matrix(x[i,], 10), x)
     ## end of initialization

     fun <- function(x) {
         v <- eigen(x, symmetric=FALSE, only.values=TRUE)$values
         min(abs(v))
     }
     dd1 <- sapply(xlist, fun)
     dd2 <- dd1 + dd1 / sum(dd1)
     sum(dd1 * dd2)
}

 > system.time(lapply(1:8, doit, Maxi, Maxj))
    user  system elapsed
   6.677   0.016   6.714
 > system.time(mclapply(1:64, doit, Maxi, Maxj, mc.cores=8))
    user  system elapsed
  68.857   1.032  10.398
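
For the Windows route (makePSOCKcluster and parLapply), a minimal sketch might look like the following. Here work() is a hypothetical stand-in for doit(), and the worker count of 2 is an arbitrary choice, not a recommendation.

```r
library(parallel)

## Hypothetical stand-in for doit(): any function of i plus extra arguments.
work <- function(i, n) sum(seq_len(n)) + i

## Start 2 worker processes (arbitrary; adjust to the machine).
cl <- makePSOCKcluster(2)

## Workers are fresh R sessions. If the function needs a package such as
## MASS, load it on the workers first, e.g. clusterEvalQ(cl, library(MASS)).
res <- parLapply(cl, 1:4, work, n = 10)

## Always shut the workers down when finished.
stopCluster(cl)
```

Unlike mclapply, which forks the current process, the PSOCK workers do not inherit the master's workspace, so anything the function relies on has to be passed as an argument or exported with clusterExport.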

The extra arguments to eigen are important: only.values=TRUE skips the 
computation of eigenvectors, and symmetric=FALSE skips the symmetry 
check. Avoiding unnecessary repeated calculations matters as well. The 
allocate-and-grow strategy (result.vec = numeric(); result.vec[i] <- ...) 
is very inefficient, because result.vec is copied in its entirety for 
each new value of i. It is better to preallocate and fill 
(result.vec = numeric(Maxi); result.vec[i] <- ...) or to let lapply 
manage the allocation.
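
The three allocation strategies can be sketched side by side. The function f() and the length n are hypothetical, chosen only for illustration; all three produce the same vector and differ only in how the result is allocated.

```r
n <- 1e4
f <- function(i) sqrt(i)   # hypothetical per-iteration computation

## Allocate-and-grow: the vector is extended on every iteration.
grow <- numeric()
for (i in seq_len(n)) grow[i] <- f(i)

## Preallocate-and-fill: the full-length vector is created once.
fill <- numeric(n)
for (i in seq_len(n)) fill[i] <- f(i)

## Let the apply family manage the allocation.
viaVapply <- vapply(seq_len(n), f, numeric(1))
```

Wrapping each variant in system.time() with a larger n is one way to compare them on a given machine.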

Martin