How to check if %dopar% really runs in parallel?
It definitely works on an 8-core machine (new shiny toy, oh my!). I have
return series for about 40 instruments dating back to 2000. Before getting
snow/foreach/%dopar% to work, I would run the command:
chart.VaRSensitivity(R, methods=c("HistoricalVaR", "ModifiedVaR",
"GaussianVaR"), clean="geltner", colorset=bluefocus, lwd=2)
This took a bit of time to go through all the instruments and generate a
VaR sensitivity graph for each one.
The code below sped that process up significantly on a single 8-core
machine, to less than 30 seconds by my estimate:
# define the parallelization function
run.sens <- function(R) {
    library(PerformanceAnalytics)
    png(file=paste("VAR-Sens-",R,".png", sep=""), width=500,
        height=500)
    chart.VaRSensitivity(R, methods=c("HistoricalVaR",
        "ModifiedVaR", "GaussianVaR"), clean="geltner", colorset=bluefocus, lwd=2)
    dev.off()
}
# let's do it, using the instrument returns
foreach(R=MyGlobalInstruments.returns) %dopar% run.sens(R)
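A note on the file names: the snippet above pastes R into the file name, which works if R is a name. If R were the return series itself, paste() would produce one string per element. A minimal sketch of a name-based variant follows, assuming the returns sit in a data frame with one named column per instrument. The data, the column names, the helper run.sens.named and the placeholder plot() call are all made up for illustration; this is not the original code.

```r
library(foreach)
library(doSNOW)

# Hypothetical stand-in data: 3 instruments, 100 observations each.
set.seed(1)
returns <- as.data.frame(matrix(rnorm(300, sd = 0.01), ncol = 3))
names(returns) <- c("AAA", "BBB", "CCC")

out.dir <- tempdir()

# Hypothetical helper: plot one instrument, return the path of the file written.
run.sens.named <- function(name, returns, out.dir) {
  path <- file.path(out.dir, paste("VAR-Sens-", name, ".png", sep = ""))
  png(filename = path, width = 500, height = 500)
  on.exit(dev.off())  # close the device even if plotting fails
  # Placeholder plot; the real call would be chart.VaRSensitivity() on
  # the single column returns[, name, drop = FALSE].
  plot(returns[[name]], type = "l", main = name)
  path
}

cl <- makeCluster(2, type = "SOCK")
registerDoSNOW(cl)
# Iterate over instrument names so each task knows which file to write.
files <- foreach(name = names(returns), .combine = c) %dopar%
  run.sens.named(name, returns, out.dir)
stopCluster(cl)
```

Returning the path from each task gives the master a simple list of what was written, which is handy for checking that every instrument produced a file.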
As some have mentioned, be careful what you choose to parallelize. This
particular example does *not* work well across networked clusters because
I'm creating a .png file for each instrument. It *does*, however, make
sense to run it across the full 8 cores available to me on the local
machine (any number of cores is fine; I run the same routine on a 4-core
Linux box).
          User   System   Elapsed
Before  722.03     1.57    763.18
After     0.04     0.25    572.01
HTH,
cedrick
On 5/4/2010 9:31 AM, Mario Valle wrote:
*BIG RED FACE*
I'm ashamed of myself, that was the error! A small, stupid pair of
parentheses was missing. Now the parallel version is faster than the
serial one, as it should be (serial: 57.41, parallel on 2 cores: 39.31).
Thanks to Stephen and all.
mario
Stephen Weston wrote:
There is a mistake. Rather than:
times(10000) %dopar% fun
you should write:
times(10000) %dopar% fun()
On my machine, "fun" executes in about 0.4 seconds, so running it
10,000 times should take over an hour. Your error turned a real program
into a toy program: without the parentheses, "fun" is never called. The
error also resulted in more communication, since now the function itself
is being returned by the workers.
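To make the difference concrete, here is a small sketch using %do%, so no cluster is needed. The function body is shortened and made to return "done" purely for illustration.

```r
library(foreach)

# Shortened, illustrative version of the benchmark function.
fun <- function() {
  for (q in 1:1000) sqrt(3)
  "done"
}

# Without parentheses the loop body is just the symbol `fun`, so each
# task hands back the function object without ever calling it.
no.call <- times(3) %do% fun

# With parentheses each task actually calls the function.
with.call <- times(3) %do% fun()

is.function(no.call[[1]])  # the "result" is the function itself
with.call[[1]]             # the value returned by calling fun()
```

The same holds for %dopar%, except that the uncalled function then also has to be shipped back from each worker.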
When I ran your benchmark on my machine with 100, rather than 10,000
tasks, I got the following results:
user system elapsed
43.573 0.191 43.823
user system elapsed
0.093 0.007 24.890
That's not so bad.
- Steve
On Tue, May 4, 2010 at 12:22 AM, Mario Valle <mvalle at cscs.ch> wrote:
Is there any way to check that %dopar% really runs in parallel?
The following code (on a dual-core laptop running Windows + R 2.11.0pat and
on Linux + R 2.11.0) runs %dopar% more slowly than the same %do% code.
BTW, if you see any obvious mistake in the code...
Thanks!
mario
library(doSNOW)
library(foreach)
fun<- function() for(q in 1:1000000) sqrt(3)
system.time(times(10000) %do% fun, gcFirst = TRUE)
# user system elapsed
# 5.74 0.01 6.24
cl<- makeCluster(2, type = "SOCK")
registerDoSNOW(cl)
system.time(times(10000) %dopar% fun, gcFirst = TRUE)
# user system elapsed
# 7.89 0.19 9.01
stopCluster(cl)
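Apart from timing, one direct way to check is to ask foreach what backend is registered and to have each task report the process that ran it. The sketch below assumes a 2-worker SOCK cluster as in the code above; the variable names are illustrative. If no backend is registered, %dopar% runs sequentially and issues a warning, and every task would report the master's own process ID.

```r
library(foreach)
library(doSNOW)

cl <- makeCluster(2, type = "SOCK")
registerDoSNOW(cl)

# What foreach believes is registered.
backend   <- getDoParName()     # name of the registered backend
n.workers <- getDoParWorkers()  # number of workers it will use

# Ask each task which process executed it.
pids <- foreach(i = 1:8, .combine = c) %dopar% Sys.getpid()
stopCluster(cl)

master.pid  <- Sys.getpid()
worker.pids <- unique(pids)

# Tasks ran outside the master process if its PID is not among them.
ran.on.workers <- !(master.pid %in% worker.pids)
```

Seeing process IDs that differ from the master's shows the tasks left the master process; whether that gives a speedup still depends on how much work each task does relative to the communication cost.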
--
Ing. Mario Valle
Data Analysis and Visualization Group            | http://www.cscs.ch/~mvalle
Swiss National Supercomputing Centre (CSCS)      | Tel: +41 (91) 610.82.60
v. Cantonale Galleria 2, 6928 Manno, Switzerland | Fax: +41 (91) 610.82.82
_______________________________________________
R-sig-hpc mailing list
R-sig-hpc at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-hpc