Updating variable names in secondary cluster set-ups

If it's not too presumptuous, could someone help me with another basic question?

Imagine code involving two tasks (task1 and task2), each involving a new
cluster (using makeCluster in snow or sfInit in snowfall), and where the
output list from task1 is used as input to task2.

Do I need to re-export the new task1 output list to the second cluster
set-up, or will it have been stored there from the original start-up?

In other words (I think): do we need to re-export libraries, variables,
etc. every time we start up a new cluster, or only the first time?

In my playing around (see below), it seems I only need to do it once.

Thanks,
Phil

# Example code:
library(snowfall)
cpus <- 2
nreps <- 100
fun1 <- function(x) {x} # Identity; sfLapply collects the results into a list
fun2 <- function(x) {list(x=x)} # Add another list level to first list
resultsL1 <- resultsUL1 <- resultsL2 <- resultsUL2 <- NA
# Task 1 involving 1st cluster set-up
sfInit(parallel=TRUE, cpus=cpus)
sfExportAll()
resultsL1 <- sfLapply(1:nreps, fun1)
resultsUL1 <- unlist(resultsL1)
sfStop()

# Task 2 involving 2nd cluster set-up
sfInit(parallel=TRUE, cpus=cpus)
# sfExport("resultsL1")     # Needed? Apparently not
resultsL2 <- sfLapply(resultsL1, fun2)
resultsUL2 <- unlist(resultsL2)
sfStop()
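
A more direct check than the example above might be to ask the workers themselves whether a variable exists, once on a cluster where it was exported and once on a fresh cluster where it was not. This is only a sketch: `probe` is a made-up variable name, and it assumes the default socket cluster on the local machine.

```r
library(snowfall)

probe <- c(1, 2, 3)  # hypothetical variable, used only for this check

# First cluster: export the variable, then ask each worker whether it can see it
sfInit(parallel = TRUE, cpus = 2)
sfExport("probe")
first <- unlist(sfClusterEval(exists("probe")))
sfStop()

# Second cluster: same question, but with no export this time
sfInit(parallel = TRUE, cpus = 2)
second <- unlist(sfClusterEval(exists("probe")))
sfStop()

print(first)   # one logical value per worker, after the export
print(second)  # one logical value per worker, without any export
```

If `second` comes back all FALSE, the new workers started empty. In that case, Task 2 above would work without `sfExport("resultsL1")` only because `sfLapply` sends `resultsL1` and `fun2` to the workers as arguments of the call, not because anything survived from the first cluster.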