Error: ReadItem: unknown type 98, perhaps written by later version of R
On 08/23/2012 11:46 AM, Aldi Kraja wrote:
Thanks to Martin, who sent an email off the list with, among other things, the following: "Probably the file is being corrupted on disk, perhaps it has not yet been closed before reading is attempted, or some other obscure file system issue. Probably the key part in your script is 'sleep', which probably slows disk access enough for your file system to recover integrity."

His note made me think that the problem may lie with programs running in parallel on the same processing server. There are up to 8 slots for running 8 jobs in parallel on a Linux server, and many servers are available. Each job works with unique file names for the R program and the corresponding out file, all the objects inside each R job are uniquely named with their own indices, and I finish the program with q(); n so that the R workspace is not saved at the end of each process.

Let me draw a parallel with SAS jobs. If I run 8 parallel jobs in SAS, then although SAS uses the /tmp directory of that processing server, each job has its own pid, and its temporary data are saved under unique names and removed at the end. So 8 parallel jobs on one server, and more on different servers, do not corrupt each other's data.

Now what happens with R? Eight jobs run in parallel: are they processed in unique spaces of the /tmp hard drive, or do they all write to ~/.RData? If
yes, they'll all write to ~/.RData (actually, .RData in the current directory, see ?Startup).
the latter happens, then even though the jobs are uniquely defined, it is quite possible that something goes wrong in ~/.RData and produces the reported error:

    Error: ReadItem: unknown type 98, perhaps written by later version of R
    Execution halted

Probably --no-restore --no-save may help, but isn't that dangerous if
yes, that's the right thing to do.
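A minimal sketch of that advice follows. The function name run_r_job and the DRY_RUN variable are hypothetical and exist only for this illustration; --no-save and --no-restore are standard R command-line options, and R CMD BATCH passes them on to R (its defaults are otherwise --restore --save). With both flags set, a batch job neither loads nor writes .RData, so parallel jobs started in the same directory have no shared workspace file to collide on.

```shell
#!/bin/sh
# Hypothetical helper: run one R script in batch mode without reading or
# writing .RData. Set DRY_RUN=1 to print the command instead of running it.
run_r_job() {
    script=$1                  # e.g. dc19at1.R
    out=${script%.R}.out       # e.g. dc19at1.out
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "R CMD BATCH --no-save --no-restore $script $out"
    else
        R CMD BATCH --no-save --no-restore "$script" "$out"
    fi
}

# Usage: with DRY_RUN=1 this only prints the commands for two of the
# job names mentioned in the thread.
DRY_RUN=1
for j in 1 2; do
    run_r_job "dc19at$j.R"
done
```

Unsetting DRY_RUN (or setting it to 0) would run the jobs for real; whether that is enough to cure the error reported here is not something this sketch can confirm.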
all programs (if I have 1000 of them) write to ~/.RData? So how does R handle parallel jobs of the same user with regard to the R invocation and the space used for temporary calculations? Do these parallel batch R jobs
Each independent R process gets its own temporary directory, see the output of tempdir().

    mtmorgan at precise-mtmorgan:$ R --silent --vanilla -e "tempdir()"
    > tempdir()
    [1] "/tmp/RtmpuZ7IkT"
    mtmorgan at precise-mtmorgan:$ R --silent --vanilla -e "tempdir()"
    > tempdir()
    [1] "/tmp/RtmpXnKIVO"

Hmm, but in the 'parallel' package the child processes inherit from the parent.

    > unique(unlist(mclapply(1:4, function(i) tempdir(), mc.cores=4)))
    [1] "/tmp/Rtmpkr5w6j"
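If an explicit per-job scratch area is wanted on top of that, one possible sketch is below. It relies on R creating its session directory under the directory named by the TMPDIR environment variable (see ?tempfile). The helper name make_job_tmp and the rjob. prefix are hypothetical, and the R calls are skipped when R is not on the PATH.

```shell
#!/bin/sh
# Hypothetical helper: create a private scratch directory for one job and
# print its path. mktemp -d picks a fresh, unique name on every call.
make_job_tmp() {
    mktemp -d "${TMPDIR:-/tmp}/rjob.XXXXXX"
}

a=$(make_job_tmp)
b=$(make_job_tmp)
echo "job 1 scratch: $a"
echo "job 2 scratch: $b"

# Assumption: R is installed. Each R process then reports a tempdir()
# located inside its own scratch directory.
if command -v R >/dev/null 2>&1; then
    TMPDIR=$a R --silent --vanilla -e "tempdir()"
    TMPDIR=$b R --silent --vanilla -e "tempdir()"
fi

# Remove the scratch directories again (they are empty once R has exited).
rmdir "$a" "$b" 2>/dev/null || true
```

This does not change the answer above: separate R processes already get separate temporary directories; the sketch only makes the location explicit per job.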
see each other in the same space, or are they for sure in independent temporary subdirs?

Thanks,
Aldi

On 8/22/2012 3:47 PM, Aldi Kraja wrote:
Hi,

Here is a solution for this type of error:

    Error: ReadItem: unknown type 98, perhaps written by later version of R
    Execution halted

I created a script file in the directory where the programs and data reside and ran ./script.sh there, where script.sh had the following lines:

    R CMD BATCH ./dc19at1.R ./dc19at1.out
    sleep 3
    R CMD BATCH ./dc19at2.R ./dc19at2.out
    sleep 3
    ... etc

The programs ran with no problem. So what I did was eliminate the full path, for example

    R CMD BATCH /a/b/c/dc19at1.R /a/b/c/dc19at1.out

which did not work through bsub or at the command line on a remote server.

I am not sure what the "type 98" error means in R. Does anybody know where the R error types are described?

TIA,
Aldi

On 8/21/2012 10:09 AM, Aldi Kraja wrote:
Hi,

I am running a large number of jobs (thousands) in parallel (Linux OS, 64-bit), R version 2.14.1 (2011-12-22), Platform: x86_64-redhat-linux-gnu (64-bit).

Until yesterday everything ran fine, with jobs submitted in several blocks (block1, block2, etc.). They are sent to an LSF platform, which handles the parallel submission. Today I see that only one of the blocks (block 19) has not finished correctly. It reports in the out file:

    Error: ReadItem: unknown type 98, perhaps written by later version of R
    Execution halted

Searching on Google, I found a recommendation to run

    rm ~/.RData

I applied it, but the run fails again when block 19 is submitted through SAS:

    [SAS in macro lang.] %sysexec bsub R CMD BATCH &fullpath./dc19at&j..R &fullpath.dc19at&j..out ;
    [SAS ] %sysexec sleep 3 ;
    <looping through jobs in a block>

If I go to the directory where the R program and the data reside and apply the same command by hand

    R CMD BATCH dc19at1.R dc19at1.out

it works with no problem. But if I use a similar program (a SAS program) that has been executing the same command successfully for thousands of jobs in other blocks, the jobs for block 19 fail:

    Error: ReadItem: unknown type 98, perhaps written by later version of R
    Execution halted

even though the job I just mentioned runs fine when I execute it by hand.

Do you know what could cause the bsub submission to fail? Any remedy?

Thank you in advance,
Aldi
--
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793