Parallel multi-core processing for complex R functions
On Jul 14, 2013, at 11:13 PM, Alistair Perry wrote:
As you know, R doesn't explicitly allow parallel processing on multiple cores. I have two complex functions called "test" and "rcc". I did not write this code. The basic premise of the function "test" is to calculate the rich-club members of a weighted matrix (a certain group of nodes [one node = one matrix definition] are more connected to each other than to other nodes; see http://toreopsahl.com/tnet/weighted-networks/weighted-rich-club-effect/). In short, the code is designed to calculate the strength of the ties between prominent nodes. However, to test whether there is a significant effect, the input matrix (argument "net" in the function "test") must be compared to a randomized matrix (the links of the input matrix are reshuffled), with the reshuffling repeated 1000 times. The randomized matrix is created by the function "rcc", which is called from inside "test". You can set how many times the input matrix is reshuffled using the argument "NR" (i.e. NR = 1000 means the input matrix will be reshuffled 1000 times). The issue here is that, as R does not explicitly allow multi-core parallel processing, the computation for one matrix (500x500) using "test" can take over a week. I am using a quad-core processor with a Linux OS, but only one core is being used. I am aware that there is the base package "parallel" and its function "mclapply" for running a function in parallel, but these commands require an input argument ("X") along with the function I wish to run on all cores. However, the functions I am using already take an input (equivalent to X) among their arguments, so it would mean passing the input matrix twice.
That's not true: you can use the matrix directly from the parallel code, because everything is shared (mclapply forks the R process, so the workers see the objects of the parent session). If you had written your code using an apply-style function instead of a loop, you would have seen that all you need to do is to replace lapply with mclapply.
You have
rphi <- matrix(data = 0, nrow = nrow(ophi), ncol = NR)
for (i in 1:NR) {
  rnet <- rcc(net, option = reshuffle)
  rphi[, i] <- phi(rnet)
}
which can be simplified to
rphi <- sapply(seq.int(NR), function(i) phi(rcc(net, option=reshuffle)))
and thus the parallel version is simply
rphi <- simplify2array(mclapply(seq.int(NR), function(i) phi(rcc(net, option=reshuffle))))
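For anyone who wants to try the pattern in isolation, here is a minimal sketch. It assumes toy stand-ins for rcc() and phi() (the real ones are in the poster's code and are not reproduced here), a small made-up matrix, and two workers. It also sets mc.cores explicitly, since mclapply otherwise uses getOption("mc.cores", 2L), and it switches to the "L'Ecuyer-CMRG" generator so the random reshuffles are reproducible across forked workers. Note that net is picked up from the enclosing environment rather than being passed as X.

```r
library(parallel)

# Hypothetical stand-ins so the sketch runs on its own; the real rcc() and
# phi() belong to the poster's code and will differ.
rcc <- function(net, option = "links") {
  out <- net
  off <- row(net) != col(net)      # off-diagonal cells only
  out[off] <- sample(net[off])     # toy reshuffle of the weights
  out
}
phi <- function(net) rowSums(net)  # toy statistic: one value per node

# Made-up input data (assumption, for illustration only)
set.seed(1)
net <- matrix(runif(25), nrow = 5)
diag(net) <- 0
NR <- 8
reshuffle <- "links"

# Reproducible random number streams in the forked workers
RNGkind("L'Ecuyer-CMRG")
set.seed(42)

# Forking is not available on Windows, where mc.cores must be 1.
# On a quad-core Linux machine one would typically use 4L here.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# X is only the replicate index; net is found in the parent environment
rphi <- simplify2array(
  mclapply(seq.int(NR),
           function(i) phi(rcc(net, option = reshuffle)),
           mc.cores = n_cores)
)

dim(rphi)  # one row per node, one column per reshuffle
```

With NR = 1000 and mc.cores = 4, mclapply by default pre-assigns the replicate indices to the four workers, so each handles roughly 250 reshuffles.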
Cheers,
Simon
Does anyone have an idea of how I could multi-thread the code? In particular, I am interested in the section of the code where the matrix is reshuffled, so that the matrix could be reshuffled 250 times on each core (if there were 4 cores). This would speed up the computation dramatically. The code can be viewed at http://stackoverflow.com/questions/17646190/parallel-and-multicore-processing-for-complex-r-function
_______________________________________________
R-sig-hpc mailing list
R-sig-hpc at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-hpc