mclapply: problem writing into a textfile within a loop
On Jul 10, 2012, at 9:28 AM, Mauricio Zambrano-Bigiarini wrote:
Dear list,
While using the mclapply function provided by the multicore package, I notice that more lines are written than expected when writing the outputs of the simulations into a textfile, AFTER the call to 'mclapply'.
On the other hand, when using a cluster from the 'parallel' package (makeCluster/parApply), I do not get any "additional" output.
Below you can find a reproducible example:
--------START--------
fn <- function(x) {
  n <- length(x)
  return(1 + (1/4000) * sum(x^2) - prod(cos(x/sqrt(seq_len(n)))))
}
fn1 <- function(i, x) fn(x[i,])
# draw enough values to fill the whole nr x 50 matrix
nr <- 50 ; X <- matrix(rnorm(nr*50), ncol=50, nrow=nr)
#######################
# multicore: mclapply #
fname <- "~/logfile_multicore.txt"
TextFile <- file(fname, "w+")
for (iter in 1:3) {
  library(multicore)
  set.seed(100)
  unlist(multicore::mclapply(1:nr, FUN=fn1, x=X, mc.cores=6))
  for (i in 1:2) {
    writeLines(c("iter:", as.character(iter), " ; i:", as.character(i) ), TextFile, sep=" ")
    writeLines("", TextFile)
  } # FOR i end
} # FOR iter end
close(TextFile)
# output:
#iter: 1 ; i: 1 # it should not be here
#iter: 1 ; i: 2 # it should not be here
#iter: 1 ; i: 1 # it should not be here
#iter: 1 ; i: 2 # it should not be here
#iter: 1 ; i: 1 # it should not be here
#iter: 1 ; i: 2 # it should not be here
#iter: 2 ; i: 1 # it should not be here
#iter: 2 ; i: 2 # it should not be here
#iter: 1 ; i: 1
#iter: 1 ; i: 2
#iter: 2 ; i: 1
#iter: 2 ; i: 2
#iter: 3 ; i: 1
#iter: 3 ; i: 2
############
# parallel #
fname <- "~/logfile_parallel.txt"
TextFile <- file(fname, "w+")
for (iter in 1:3) {
  cl <- parallel::makeCluster(6)
  set.seed(100)
  parApply(cl=cl, X, 1, fn)
  stopCluster(cl)
  for (i in 1:2) {
    writeLines(c("iter:", as.character(iter), " ; i:", as.character(i) ), TextFile, sep=" ")
    writeLines("", TextFile)
  } # FOR i end
} # FOR iter end
close(TextFile)
# output:
#iter: 1 ; i: 1
#iter: 1 ; i: 2
#iter: 2 ; i: 1
#iter: 2 ; i: 2
#iter: 3 ; i: 1
#iter: 3 ; i: 2
-----------END-----------------
The same results are obtained if the call to 'multicore::mclapply' is replaced by 'parallel::mclapply'.
sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-redhat-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.utf8       LC_NUMERIC=C
 [3] LC_TIME=en_GB.utf8        LC_COLLATE=en_GB.utf8
 [5] LC_MONETARY=en_GB.utf8    LC_MESSAGES=en_GB.utf8
 [7] LC_PAPER=C                LC_NAME=C
 [9] LC_ADDRESS=C              LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  splines   stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
[1] multicore_0.1-7

Do you know if there is any way to avoid writing these additional lines after calling mclapply?
Simply add flush(TextFile) after the second writeLines(). Since you're forking the processes without flushing the buffers, each forked process inherits a copy of the unflushed output and writes it to the file when it exits. Obviously, using makeCluster() doesn't have that effect since it creates new, independent processes.

Cheers,
Simon
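As a minimal sketch of the suggestion above, the loop from the original post can be rerun with flush(TextFile) added after the second writeLines(). The use of parallel::mclapply instead of multicore, mc.cores = 2, the tempfile() log path, and the names res and log_lines are assumptions made for illustration and are not part of the original post. Forking requires a Unix-alike; on Windows mclapply() only accepts mc.cores = 1.

```r
library(parallel)  # assumption: parallel::mclapply stands in for multicore::mclapply

fn <- function(x) {
  n <- length(x)
  1 + (1/4000) * sum(x^2) - prod(cos(x / sqrt(seq_len(n))))
}
fn1 <- function(i, x) fn(x[i, ])

nr <- 50
X  <- matrix(rnorm(nr * 50), nrow = nr, ncol = 50)

# Hypothetical log location; the original post wrote to a file in the home directory
fname    <- tempfile(fileext = ".txt")
TextFile <- file(fname, "w+")

for (iter in 1:3) {
  # Forks child processes; each child inherits a copy of any unflushed buffer
  res <- unlist(mclapply(1:nr, FUN = fn1, x = X, mc.cores = 2))
  for (i in 1:2) {
    writeLines(c("iter:", as.character(iter), " ; i:", as.character(i)),
               TextFile, sep = " ")
    writeLines("", TextFile)
    flush(TextFile)  # empty the buffer so the next fork has nothing to duplicate
  }
}
close(TextFile)

# Read the log back to check how many lines were written
log_lines <- readLines(fname)
print(log_lines)
```

Because the buffer is empty whenever mclapply() forks, the children have nothing left to write when they exit, so the file should contain only the six expected lines.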
Thanks in advance,
Mauricio Zambrano-Bigiarini
--
Water Resources Unit
Institute for Environment and Sustainability (IES)
Joint Research Centre (JRC), European Commission
webinfo : http://floods.jrc.ec.europa.eu/