Memory problems, HDF5 library and R-1.2.2 garbage collection
3 messages · Norberto Eiji Nawa, Marcus G. Daniels, Trent Piepho

Hello: I've recently started using R to process data in HDF5 format. My files come in 1.5MB chunks, but they can be as large as 50MB. The problem I am facing with R-1.2.2 is that when I try to load 50 of the 1.5MB HDF5 files (using the hdf5 library) in a loop, my Linux box gets close to its memory limit around file #15 (256MB RAM and 256MB swap). This happens even if I alternate: load file -> erase all the objects -> load file -> erase all the objects... When I try to load a single 50MB HDF5 file, the computer chokes before completing the job as well.

I happen to know the author of the hdf5 library, and I am very sure he knows what he is talking about when he tells me that the HDF5 module for R only does very simple allocMatrix and allocVector calls, so garbage collection should work on those.

So my questions are:

1) [newbie level 1000] The '"generational" garbage collector, which will increase the memory available to R as needed' referred to in (*) also does the job of freeing unused memory, I suppose. So loading-HDF5-file -> erasing-all-the-objects should keep the size of R in memory more or less constant? Or at least keep R from eating up all the memory to the point of hanging my computer?

2) How could I test the garbage collection feature on my machine, supposing it does release unused memory, so as to determine whether the problem is specific to my platform (Linux 2.2.14-1vl6) or due to R code?

3) Is anyone else using HDF5 with R out there?

(*) http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html#Why%20does%20R%20run%20out%20of%20memory%3f

Thanks a lot!

Eiji

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe" (in the "body", not the subject!)
To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
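One direct way to approach question 2 is to allocate a large object, drop it, and compare the counts reported by gc() before and after. This is a minimal sketch using only base R functions; if memory is released, the "used" count should fall back roughly to its earlier level:

```r
# Test whether the collector actually releases memory: allocate a large
# object, drop the only reference, and compare gc()'s "used" counts.
gc(reset = TRUE)                      # start from a clean baseline
x <- matrix(0, 1000, 1000)            # ~8 MB of doubles (about 1e6 Vcells)
mid <- gc()                           # Vcells in use while x is alive
rm(x)                                 # drop the only reference
after <- gc()                         # collect; usage should fall back
mid["Vcells", "used"] - after["Vcells", "used"]   # roughly 1e6 cells freed
```

gc() returns a matrix with rows "Ncells" and "Vcells" and a "used" column; a large, stable difference here indicates the collector itself is fine, which would point the finger at the loading code rather than the platform.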
"NEN" == Norberto Eiji Nawa <eiji at isd.atr.co.jp> writes:

NEN> When I try to load a single 50MB HDF5 file, the computer chokes
NEN> before completing the job as well.

I'll check this out and make sure there isn't gratuitous waste happening. Problems with the big file sound plausible, but the smaller chunks should be doable. Thanks for the test cases, btw.

http://www.isd.atr.co.jp/~eiji/swarm/HDF5samples.tar.gz
  ArchiverHDF5.hdf.gz
  testHDF5load.R

Plugin: ftp://ftp.swarm.org/pub/swarm/src/testing/hdf5_0.9.tar.gz
On 22 Mar 2001, Marcus G. Daniels wrote:
> "NEN" == Norberto Eiji Nawa <eiji at isd.atr.co.jp> writes:
>
> NEN> When I try to load a single 50MB HDF5 file, the computer chokes
> NEN> before completing the job as well.
>
> I'll check this out and make sure there isn't gratuitous waste
> happening. Problems with the big file sound plausible, but the smaller
> chunks should be doable. Thanks for the test cases, btw.
I'm using R 1.2.2 to read in large netCDF files. I've read in about a tenth of some 220MB netCDF files, and I've processed 82 of these files in a row without restarting R and without memory problems. So R clearly can read large datasets repeatedly; the problem is more likely in the HDF5 module. I know that the netCDF 1.2 library is very inefficient at some things, and I've pretty much totally rewritten it.
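The pattern Trent describes, many files processed in one session, amounts to dropping each dataset and letting the collector run between files. A self-contained sketch of that loop follows; read.one.file here is a hypothetical stand-in for the real reader (hdf5load() from the hdf5 package, or a netCDF equivalent) that just allocates data so the example runs on its own:

```r
# Sketch: process many files in one session without memory growth.
# read.one.file is a stand-in for the real loader; it allocates ~8 MB
# per "file" so the example is self-contained.
read.one.file <- function(f) matrix(0, 1000, 1000)

used <- numeric(10)
for (i in 1:10) {
  dat <- read.one.file(i)
  # ... process dat here ...
  rm(dat)                            # erase the only reference to the data
  used[i] <- gc()["Vcells", "used"]  # cells still in use after collection
}
# If the collector is doing its job, usage stays flat across iterations
# instead of growing by ~1e6 Vcells per file.
max(used) - min(used)
```

If the same loop with the real HDF5 reader shows monotonically growing "used" counts despite the rm() calls, the leak is in the loading code rather than in R's collector.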