Parallel processing random 'save' error
2 messages · Rguy, William Dunlap
Error in checkForRemoteErrors(val) : one node produced an error: (converted from warning) 'D:\_pgf\quantile_analysis2_f13\_save\dbz084_nump48\bins' already exists
That warning looks like it comes from dir.create(). Do you have
code that looks like this?
    if (!file.exists(tempDir)) {
        dir.create(tempDir)
    }
If so, that could be the problem. The directory may not exist when file.exists()
is called, but by the time dir.create() is called another process may have created
it. Try replacing such code with
    suppressWarnings(dir.create(tempDir))
    if (!isTRUE(file.info(tempDir)$isdir)) {
        stop("Cannot create tempDir=", tempDir)
    }
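The same create-then-verify idea can be wrapped in a small helper. The following is only a minimal sketch: the name ensure_dir and the demo directory under tempdir() are hypothetical and do not come from the original code, and it assumes the workers run with options(warn = 2), which the "(converted from warning)" text in the error suggests. It uses the showWarnings argument of dir.create() instead of suppressWarnings(), so the "already exists" warning is never raised.

```r
# Hypothetical helper (not from the original thread): create a directory
# without failing when another process has already created it.
ensure_dir <- function(path) {
  # showWarnings = FALSE means an existing directory raises no warning,
  # so nothing can be promoted to an error under options(warn = 2).
  dir.create(path, showWarnings = FALSE, recursive = TRUE)
  # Verify the outcome rather than trusting an earlier existence check.
  if (!isTRUE(file.info(path)$isdir)) {
    stop("Cannot create directory: ", path)
  }
  invisible(path)
}

old <- options(warn = 2)                       # assumed warnings-as-errors setup
out_dir <- file.path(tempdir(), "bins_demo")   # hypothetical demo path
ensure_dir(out_dir)                            # creates the directory
ensure_dir(out_dir)                            # second call: no warning, no error
options(old)
```

Calling such a helper at the top of the function passed to parLapply, or once in the master process before the workers start, would avoid the check-then-create race between workers.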
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
Of Rguy
Sent: Monday, July 01, 2013 1:07 AM
To: r-help at r-project.org
Subject: [R] Parallel processing random 'save' error
Platform: Windows 7
Package: parallel
Function: parLapply
I am running a lengthy program with 8 parallel processes running in main
memory.
The processes save data using the 'save' function, to distinct files so
that no conflicts writing to the same file are possible.
I have been getting errors like the one shown below on a random basis,
i.e., sometimes at one point in the execution, sometimes at another,
sometimes no error at all.
I should note that the directory referred to in the error message
( 'D:\_pgf\quantile_analysis2_f13\_save\dbz084_nump48\bins') contains, as I
write, 124 files saved to it by the program without any error; which
underscores the point that most of the time the saves occur with no problem.
Error in checkForRemoteErrors(val) :
one node produced an error: (converted from warning)
'D:\_pgf\quantile_analysis2_f13\_save\dbz084_nump48\bins' already exists
Enter a frame number, or 0 to exit
1: main_top(9)
2: main_top.r#26: eval(call_me)
3: eval(expr, envir, enclos)
4: quantile_analysis(2)
5: quantile_analysis.r#69: run_all(layr, prjp, np, rules_tb, pctiles_tb,
       parx, logdir, logg)
6: run_all.r#73: parLapply(cl, ctrl_all$vn, qa1, prjp, dfr1, "iu__bool",
       parx, logdir, tstamp)
7: do.call(c, clusterApply(cl, x = splitList(X, length(cl)), fun = lapply,
       fun, ...), quote = TRUE)
8: clusterApply(cl, x = splitList(X, length(cl)), fun = lapply, fun, ...)
9: staticClusterApply(cl, fun, length(x), argfun)
10: checkForRemoteErrors(val)
11: stop("one node produced an error: ", firstmsg, domain = NA)
12: (function ()
    {
        error()
        utils::recover()
    })()
Following the latest error I checked the system's connections as follows:
Browse[1]> showConnections()
   description             class      mode  text     isopen   can read can write
3  "<-LAPTOP_32G_01:11741" "sockconn" "a+b" "binary" "opened" "yes"    "yes"
4  "<-LAPTOP_32G_01:11741" "sockconn" "a+b" "binary" "opened" "yes"    "yes"
5  "<-LAPTOP_32G_01:11741" "sockconn" "a+b" "binary" "opened" "yes"    "yes"
6  "<-LAPTOP_32G_01:11741" "sockconn" "a+b" "binary" "opened" "yes"    "yes"
7  "<-LAPTOP_32G_01:11741" "sockconn" "a+b" "binary" "opened" "yes"    "yes"
8  "<-LAPTOP_32G_01:11741" "sockconn" "a+b" "binary" "opened" "yes"    "yes"
9  "<-LAPTOP_32G_01:11741" "sockconn" "a+b" "binary" "opened" "yes"    "yes"
10 "<-LAPTOP_32G_01:11741" "sockconn" "a+b" "binary" "opened" "yes"    "yes"
Browse[1]>
It seems that the parallel processes might be sharing the same
connection--or is it that they are utilizing connections that have the same
name but are actually distinct because they are running in parallel?
If the connections are the problem, how can I force each parallel process
to use a different connection?
If the connections are not the problem, then can someone suggest a
diagnostic I might apply to tease out what is going wrong? Or perhaps some
program setting that I may have neglected to consider?
Thanks in advance for your help.