
Memory Fragmentation in R

4 messages · Nawaaz Ahmed, Luke Tierney, Brian Ripley

Thanks Brian. I looked at the code (memory.c) after I sent out the first 
email and noticed the malloc() call that you mention in your reply.
Looking into this code suggested a possible scenario where R would fail 
in malloc() even if it had enough free heap address space.

I noticed that if there is enough heap address space (memory.c:1796, 
VHEAP_FREE() > alloc_size) then the garbage collector is not run. So 
malloc() could fail (since there is no more address space to use), even 
though R itself has enough free space that it could reclaim. A simple 
fix is for R to try a garbage collection if malloc() fails.

I hacked memory.c to look in R_GenHeap[LARGE_NODE_CLASS].New if 
malloc() fails (in a fashion very similar to ReleaseLargeFreeVectors()).
I did a "best-fit" steal from this list and returned the result to 
allocVector(). This seemed to fix my particular problem - the large 
vectors that I had allocated in the previous round were still sitting in 
this list. Of course, the right thing to do is to check whether there are 
any free vectors of the right size before calling malloc() - but it was 
simpler to do it my way (because I did not have to worry about how 
efficient my best-fit was; memory allocation was going to fail anyway).

I can look deeper into this and provide more details if needed.

Nawaaz
Prof Brian Ripley wrote:
On Sat, 19 Feb 2005, Nawaaz Ahmed wrote:

Thanks.  It looks like it would be a good idea to modify the malloc at
that point to try a GC if the malloc fails, then retry the malloc and
only bail if the second malloc fails.  I want to think this through a
bit more before going ahead, but I think it will be the right thing to
do.

Best,

luke

I am not the expert here (the author, Luke Tierney, is probably 
listening), but I understood you to have done a gc() immediately before 
your second run: you presented statistics from it.  If so, then I don't 
understand in detail.  Probably Luke does.

That's good general advice: clear out results you no longer need and run 
gc() before starting a memory-intensive task (it also helps, if you are 
timing things, not to include the time spent gc()-ing previous work).
In 32-bit days I did sometimes run gc() at the end of each simulation 
run, just to give malloc the maximal chance to clean up the allocations.
On Sat, 19 Feb 2005, Nawaaz Ahmed wrote:

I don't think that quite corresponds to your words: it is rather that 
successful allocation would not provoke a gc (unless gc.torture is on).
I believe running ReleaseLargeFreeVectors would suffice.
They should have been released by the gc() you presented the statistics 
from, and they would have been included in those statistics if still in 
use at that point. So, I don't understand why they are still around.
I rather doubt that is better than letting the malloc sort this out, as it 
might be able to consolidate blocks if given them all back at once.
I am unclear what you actually did, but it may be a judicious gc() is all 
that was needed: otherwise the issues should be the same in the first and 
the subsequent run.  That's not to say that when the trigger gets near the 
total address space we could not do better: and perhaps we should not let 
it do so (if we could actually determine the size of the address space 
... it is 2Gb or 3Gb on Windows, for example).
I did do gc() but only at the top level functions - there were internal 
functions in libraries/packages that were allocating space.

Here is how I think the problem happens. Consider code of the form
         x = as.vector(x)
         y = as.double(y)
where x is a 500MB matrix and y is a 100MB vector.

Let's say we have 1201MB in total.
         Initially:
            x has 500MB, y has 100MB
            heap can grow by 601MB

         x = as.vector(x):
            x has 500MB, y has 100MB
            as.vector() duplicated 500MB (to be garbage collected)
            heap can grow by 101MB

         y = as.double(y):
            x has 500MB, y has 100MB
            R has 500MB waiting to be garbage collected
            as.double() requires 100MB for duplicating y
            garbage collector is not run
                - required amount (100MB) < possible heap growth (101MB)
            allocVector() calls malloc()
                - malloc() can fail at this point
                - it cannot get a contiguous 100MB

You are right, it is most likely to happen close to the trigger. But the 
fix should be easy (call gc() if malloc() fails) - I initially hacked at
trying to steal vectors from the free list because I thought the problem 
I was seeing was due to address space fragmentation. The latter could 
still be a problem and would be harder to fix.

Thanks Luke and Brian!
Nawaaz