foreach: using either %do% or %dopar% depending on condition

6 messages · Matthieu Stigler, Stephen Weston, Brian G. Peterson

#
Hi

I would like to use either %do% or %dopar%, depending on whether the 
user wants to use the parallel feature. For this, I would do 
something like:

if(hpc) foreach(icount(5), .combine = "rbind")  %dopar%
else foreach(icount(5), .combine = "rbind")  %do%
    statement of function...

I did not succeed in doing it. I tried:

foreach(icount(5), .combine = "rbind")  %do% else  %dopar% 
{
a<-runif(1)
b<-runif(1)
c(a,b)
}

Or tried to assign outside the operator:

"%my%" <-function(obj,ex) if(hpc=="none") "%do%" else "%dopar%"

foreach(icount(5), .combine = "rbind")  %my% {
a<-runif(1)
b<-runif(1)
c(a,b)
}

But it fails... any idea?

Thanks a lot!!

Matthieu
#
I think you're over-complicating this.

I only use %do% while testing something, to make certain that I can debug 
cleanly, and often not even that.

I then convert *all* code to %dopar%

The user can call registerDoSEQ() to run sequentially on one core, and this is 
the default behavior if no parallel registerDo* backend is registered (e.g. 
foreach is installed, but none of the parallel backends, or the user has not 
registered a backend).
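
A minimal sketch of that pattern, assuming only that the foreach package is installed. The loop body and the name `res` are placeholders for illustration:

```r
library(foreach)

# Register the sequential backend explicitly, so that %dopar% runs on one core.
registerDoSEQ()

# The loop is written once with %dopar%; the registered backend decides how it runs.
res <- foreach(i = seq_len(5), .combine = "rbind") %dopar% {
  c(i, i^2)
}

dim(res)            # 5 rows, 2 columns
getDoParWorkers()   # number of workers of the currently registered backend
```

Registering a parallel backend instead (doMC, doSNOW, ...) before the loop would leave the loop code unchanged.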

Regards,

    - Brian
#
You could try something like this:

    > '%my%' <- if (hpc == "none") get('%do%') else get('%dopar%')

However, why not just let the user decide by either registering a parallel
backend or not?  The main purpose of the "registerDoSEQ" function is
to allow the user to say that parallel operations should be done sequentially.
The '%dopar%' function is the programmer's way to declare that the
foreach loop can be executed in parallel.  It's the user's job to declare
how the '%dopar%' should be executed.  That also simplifies the code
by not having to define and pass around yet another option to your
functions.
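
For completeness, a small runnable sketch of the operator-selection approach. Here `hpc` is a hypothetical option set by hand, and the loop body is the one from the original question:

```r
library(foreach)

hpc <- "none"   # hypothetical user option: "none" or "foreach"

# Pick the operator once; %my% is then an ordinary binary operator.
`%my%` <- if (hpc == "none") `%do%` else `%dopar%`

res <- foreach(i = seq_len(5), .combine = "rbind") %my% {
  a <- runif(1)
  b <- runif(1)
  c(a, b)
}

dim(res)   # 5 rows, 2 columns
```

This works because %do% and %dopar% are plain functions of two arguments, so they can be assigned to another name like any other R object.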

- Steve
On Mon, May 3, 2010 at 6:45 AM, mat <matthieu.stigler at gmail.com> wrote:
#
Thanks a lot Brian and Steve for your prompt answer!!

You both pointed out that what I'm doing is a little bit complicated... let 
me explain why I do this, and maybe you will agree?

The end-user function is not about parallel computing itself; it is a test 
for a time series model, whose computations can be sped up with parallel 
computing. Say the function is (using the names as in package 
strucchange, which is to my knowledge the only one to make an explicit 
call to foreach):

mytest<-function(x, y, optionA=c("A", "B"), hpc=c("none", "foreach"))

So the rationale is that one assumes that by default the user does not 
use the parallel option, hence the %do% call (using %dopar% by default 
would add an unexpected warning, no?). But if the user wants, then 
we will call %dopar%.

Do you think it makes sense then? Are there any recommendations / 
standardization efforts on how one should provide a link to foreach or 
snow from other packages?

Thanks a lot!!!

Matthieu





#
Matthieu,

I think I understand what you're saying, but you still don't need it, in my 
opinion.

registerDoSEQ() does what you're describing: it instructs foreach to run 
without a parallel backend.  If no parallel backend is available, this is the 
default.

As for standardizations, just put foreach in 'Depends:' in your package 
DESCRIPTION file, as you would for any other package.  'foreach' will be 
required to install your package, and it will be loaded automatically when your 
package loads, as with any other R package.

I'll note again, that when foreach is loaded, if the user takes no explicit 
action to register a parallel backend, foreach will run sequentially, 
automatically, without any of the complications you describe, or the workaround 
suggested by Steve.

Regards,

    - Brian
#
Brian G. Peterson wrote:
Do I understand correctly that you would add something like:

mytest<-function(x, y, optionA=c("A", "B"), hpc=c("none", "foreach") )
    ...
    if(hpc=="none") registerDoSEQ()
    foreach() %dopar%
    ...
I was rather thinking of conventions for naming the arguments of functions 
that call parallel routines: are there recommendations, so that the user 
gets a consistent interface when calling parallel tasks across packages?
But it will issue a warning, right?
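
A runnable sketch of the pattern above, under stated assumptions: the arguments are reduced to a hypothetical `n`, and the loop body is a placeholder rather than the real test statistic. As far as I know, foreach warns once per session ("executing %dopar% sequentially: no parallel backend registered") when %dopar% runs with nothing registered; checking getDoParRegistered() and falling back to registerDoSEQ() is one way to avoid that:

```r
library(foreach)

# Hypothetical wrapper; the loop body is a placeholder, not the real test.
mytest <- function(n = 5, hpc = c("none", "foreach")) {
  hpc <- match.arg(hpc)

  # Fall back to the sequential backend when the user asks for no parallelism
  # or has not registered any backend. Note: this changes the globally
  # registered backend as a side effect.
  if (hpc == "none" || !getDoParRegistered()) registerDoSEQ()

  foreach(i = seq_len(n), .combine = "rbind") %dopar% {
    c(i, sqrt(i))
  }
}

out <- mytest(4)
dim(out)   # 4 rows, 2 columns
```

With hpc = "foreach", the function would use whatever backend the user registered beforehand.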

thanks a lot!!

Matthieu