Seeing some memory leak with foreach...
Hi Jonathan,
I've run into a similar problem before. It took more than 2 weeks to
track down, but when I did, it turned out to be associated with
resolving dynamically-linked symbols ('getNativeSymbolInfo'). My code
was doing this very frequently, which led to a memory leak.
Once I found the source of the problem, I reworked my code to avoid
this (a good idea anyway since the symbol-resolution was largely
redundant) and never got to the very bottom of the problem, which may
very well have been in the Linux kernel rather than in R itself. This
may have nothing to do with the problem you're experiencing: if it
does, I hope this note will save you some time. It would be
interesting to hear about the source of the memory leaks, whatever it
turns out to be.
Have you tried another parallel backend or a parallelization approach
other than 'foreach'?
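
In case it is useful, here is a minimal sketch of one such alternative. It
assumes the leaked memory belongs to the worker processes and is released
when they exit, which I have not verified for your case. It uses 'parLapply'
from the base 'parallel' package instead of 'foreach' and starts a fresh
cluster for each batch of tasks. The names 'process_task' and
'run_in_batches', and the worker and batch counts, are hypothetical
placeholders, not anything from your code.

```r
library(parallel)

# Hypothetical stand-in for the real per-task work: it would write its
# own output to disk and return NULL, as in the original description.
process_task <- function(i) {
  invisible(NULL)
}

# Run the tasks in batches, starting a fresh cluster for each batch so
# that the worker processes (and whatever memory they hold) are
# discarded when the batch finishes.
run_in_batches <- function(tasks, fun, n_workers = 2, batch_size = 10) {
  batches <- split(tasks, ceiling(seq_along(tasks) / batch_size))
  for (batch in batches) {
    cl <- makeCluster(n_workers)
    tryCatch(
      parLapply(cl, batch, fun),
      finally = stopCluster(cl)  # always shut the workers down
    )
  }
  invisible(length(batches))
}

n_batches <- run_in_batches(1:25, process_task)
```

Restarting the workers costs some startup time per batch, so the batch size
trades that overhead against how much memory creep you can tolerate.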
Aaron
On Tue, Feb 26, 2013 at 9:49 AM, Jonathan Greenberg <jgrn at illinois.edu> wrote:
r-sig-geo'ers:

I always hate doing this, but the test function/dataset is going to be hard to pass along to the list. Basically: I have a foreach call that has no superassignments or strange environmental manipulations, but it resulted in the nodes showing a slow but steady memory creep over time. I was using a parallel backend for foreach via doParallel.

Has anyone else seen this behavior (unexplained memory creep)? Is there a good way to "flush" a node? I'm trying to embed gc() at the top of my foreach function, but this process took about 24 hours to get to a memory overuse stage (multiple iterations would have passed, i.e. the function would have been called more than once on a single node), so I'm not sure whether this will work, and I figured I'd ask the group about it. I've seen other people post about this on various boards with no clear response or solution (gc() apparently didn't work).

Some other notes: there should be no resultant output of data, because the output is being written from within the foreach function (i.e. the output of the function that foreach executes is NULL).

I'll see if I can work up a faster-executing example later, but I wanted to see if there are some general pointers for dealing with memory leaks on a parallel system.

--j

--
Jonathan A. Greenberg, PhD
Assistant Professor
Global Environmental Analysis and Remote Sensing (GEARS) Laboratory
Department of Geography and Geographic Information Science
University of Illinois at Urbana-Champaign
607 South Mathews Avenue, MC 150
Urbana, IL 61801
Phone: 217-300-1924
http://www.geog.illinois.edu/~jgrn/
AIM: jgrn307, MSN: jgrn307 at hotmail.com, Gchat: jgrn307, Skype: jgrn3007
_______________________________________________
R-sig-hpc mailing list
R-sig-hpc at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-hpc
Aaron A. King, Ph.D.
Ecology & Evolutionary Biology
Mathematics
Center for the Study of Complex Systems
University of Michigan
GPG Public Key: 0x15780975