
Problems with Exporting Functions with Foreach/DoSNOW

Hi Steve,

Thanks for your reply.

That all makes sense. I have noticed that functions in packages get loaded just fine, and I'm intending to go that route at some point.

I think I will look into clusterExport for the time being - it's simpler for development.

Thank you,
Reuben

-----Original Message-----
From: Stephen Weston [mailto:stephen.b.weston at gmail.com] 
Sent: Monday, November 28, 2011 1:59 PM
To: Reuben Bellika
Cc: r-sig-hpc at r-project.org
Subject: Re: [R-sig-hpc] Problems with Exporting Functions with Foreach/DoSNOW

Hi Reuben,

The problem is that main.fun expects to find helper1 and helper2 in the global environment, but the foreach .export argument is exporting them to a temporary environment that is only used for the duration of that foreach operation.  Thus, foreach is able to find main.fun, but main.fun can't find helper1 or helper2 since main.fun is only looking in the global environment and the currently loaded packages.

You don't have this problem with doMC because its workers are dynamically forked by the multicore package: they actually have helper1 and helper2 defined in the global environment, just like your R session.  doSNOW tries to help with automatic and manual exporting tricks, but it's far from perfect, as your example aptly demonstrates.

To avoid this problem, foreach would either have to export the functions to the global environment, or modify main.fun to include the temporary environment in its scope.  Both of those options seem to have problems.
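To make the scoping issue concrete, here is a rough sketch in plain R with no cluster involved.  It only mimics the lookup behaviour: worker.global and export.env are hypothetical stand-ins for a worker's global environment and for the temporary export environment (assumed here to be a child of the worker's global environment), and demo.helper and demo.main are made-up names.  It is not how foreach or doSNOW is actually implemented.

```r
# Stand-in for a worker's global environment (no user functions defined yet).
worker.global <- new.env(parent = baseenv())
# Stand-in for the temporary environment that receives the .export'ed objects.
export.env <- new.env(parent = worker.global)

# "Export" a hypothetical helper and main function into the temporary environment.
assign("demo.helper", function(i) i + 1, envir = export.env)
f <- function(i) demo.helper(i) + 10
# A function defined at top level has the global environment as its enclosure;
# on a worker that becomes the worker's global environment.
environment(f) <- worker.global
assign("demo.main", f, envir = export.env)

# Evaluate the "loop body" in the temporary environment, capturing any error.
run.body <- function() {
    tryCatch(eval(quote(demo.main(1)), export.env),
             error = function(e) conditionMessage(e))
}

# demo.main is found, but it cannot see demo.helper from its own enclosure.
failure <- run.body()
print(failure)

# Option 1: put the helper into the (stand-in) global environment.
assign("demo.helper", get("demo.helper", envir = export.env), envir = worker.global)
fix1 <- run.body()
print(fix1)   # 12
rm("demo.helper", envir = worker.global)

# Option 2: give demo.main the temporary environment as its enclosure.
g <- get("demo.main", envir = export.env)
environment(g) <- export.env
assign("demo.main", g, envir = export.env)
fix2 <- run.body()
print(fix2)   # 12
```

Option 1 leaves objects behind in the worker's global environment after the loop finishes, and option 2 silently changes what the user's function can see, which hints at why neither is an obviously safe default.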

In general, I prefer to put these kinds of functions into a package.  Then you just need to use the foreach .packages argument to load that package on the workers.
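As a minimal sketch of the .packages approach, the example below uses the tools package that ships with R purely as a stand-in for your own package (an assumption for illustration; tools is not attached on the workers by default, so the call only works because .packages loads it there).  The two-worker socket cluster is likewise just an example setup.

```r
library(foreach)
library(snow)
library(doSNOW)

# Start a small local cluster and register it with foreach
cl <- makeCluster(2, type = "SOCK")
registerDoSNOW(cl)

# file_ext() lives in the 'tools' package; .packages makes each worker
# load that package before evaluating the loop body.
exts <- foreach(f = c("a.txt", "b.csv"),
                .combine = "c",
                .packages = "tools") %dopar%
{
    file_ext(f)
}
print(exts)

# Stop the cluster
stopCluster(cl)
```

With your own package, the only change would be the name passed to .packages, and the same code should then work with other foreach backends as well.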

A simple alternative is to use the snow clusterExport function to export main.fun, helper1, and helper2 to the snow workers.  Then they really will be defined in the global environment as main.fun expects.  That isn't portable between different parallel backends, of course.  That's why I think that putting them in a package is a better option.

Here's a modified version of your example that uses the clusterExport function to fix the problem:


library(foreach)
library(snow)
library(doSNOW)

# Two helper functions
helper1 <- function(i) { return(i + 1) }
helper2 <- function(i) { return(i + 2) }

# The main function called once each loop
main.fun <- function(i) {
   # Call two other functions
   return(helper1(i) + helper2(i))
}

# Compute the values (odd numbers from 5 to 23) using a for loop
compute.local <- function() {
   values <- c()
   for (i in 1:10)
   {
       values <- c(values, main.fun(i))
   }

   return(values)
}

# Compute the values (odd numbers from 5 to 23) using a foreach loop
compute.cluster <- function() {
   values <- foreach(i = 1:10,
                     .combine = "c") %dopar%
   {
       main.fun(i)
   }

   return(values)
}

# Start the cluster and register with doSNOW (node names are just examples)
cl <- makeCluster(2, type = "SOCK")
clusterExport(cl, c("main.fun", "helper1", "helper2"))
registerDoSNOW(cl)

print(compute.local())
print(compute.cluster())

# Stop the cluster
stopCluster(cl)


And thanks for the excellent test program.

- Steve
On Mon, Nov 28, 2011 at 3:36 PM, Reuben Bellika <reuben at deltamotion.com> wrote: