
exceeding memory causes crash on Linux

5 messages · Paul Gilbert, Luke Tierney, Peter Dalgaard

#
I am having an unusual difficulty with R 1.8.0 on Mandrake 9.1, running a 
problem that takes a large amount of memory. With R 1.7.1 this ran on 
the machine I am using (barely), but now takes more memory than is 
available.  The first two times I tried with R 1.8.0, R exited after the 
program had run for some time, and gave no indication of anything, just 
returned to the shell command prompt. I ran under gdb to see if I could 
get a better indication of the problem, and this crashed Linux 
completely, or at least X, but I couldn't get another console either. (I 
haven't had anything crash Linux in a long time.) To confirm this I ran 
R under gdb again, and ran top to verify I was hitting memory 
constraints (which I was), but this time R did give a message "Error: 
cannot allocate a vector of size ..."

I'm not worried about running the problem, but I would like a more 
graceful exit. Might this be related to the change in error handling?

Paul Gilbert
#
Paul Gilbert wrote:

P.S. But there does not seem to be proper garbage collection after this. 
Top showed the memory still in use, and subsequent attempts to run the 
program failed immediately when trying to allocate a much smaller vector. 
When I did gc() explicitly it did clean up and I could start the 
function again. The second time R exited back to the gdb prompt with the 
message "Program terminated with signal SIGKILL, Killed. The program no 
longer exists."
#
Paul Gilbert wrote:

P.P.S. I can reproduce this (at least the SIGKILL part) on a machine 
with 500MB memory and swap turned off simply with
    z <- rnorm(50000000)
(Turning off swap is simply to make the failure happen quickly rather 
than slowly on a bigger problem.)
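For scale (my own back-of-envelope arithmetic, not from the thread): a
numeric vector of 50,000,000 doubles needs roughly 400 MB before any
copies are made, so on a 500 MB machine with swap off this single
allocation nearly exhausts RAM by itself:

```shell
# 50,000,000 doubles at 8 bytes each
echo $((50000000 * 8))              # bytes: 400000000
echo $((50000000 * 8 / 1048576))    # MiB: 381
```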
#
On Thu, 9 Oct 2003, Paul Gilbert wrote:

Possible but not likely.

When you really push memory for your R process to the limit you create
a situation where other programs may fail because there is no more
memory for them to get at.  At some point the kernel decides there is
a problem and starts doing things to bring the system into control by
the only means it really has: blowing away processes with a SIGKILL,
which cannot be caught, so there is nothing R can do about it.  The
same goes for any other process, say your X server, if that is the one
the kernel decides to blow away.  I forget the actual rules the kernel
uses to handle these situations, but that is the gist.
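Two things worth trying here (my suggestions, not part of Luke's reply):
the kernel logs its OOM kills, and a per-process address-space limit set
before starting R makes oversized allocations fail inside R with a
catchable "cannot allocate" error, rather than drawing a SIGKILL:

```shell
# Look for evidence of past OOM kills in the kernel log
# (message wording varies across kernel versions)
dmesg 2>/dev/null | grep -i -e oom -e "killed process" || true

# Cap the address space for this shell and its children (in kB; ~400 MB),
# then start R from the same shell; allocations past the cap fail with a
# normal error instead of an uncatchable SIGKILL.
ulimit -v 400000
ulimit -v          # confirm the limit took effect
# R --vanilla      # start R under the limit (commented out; illustrative)
```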

I don't think the kernel goes into this self-defense mode until it
gets close to running out of both physical memory and swap space.  One
thing you might try to do is check out how much swap space you have.
If you don't have enough you might try adding a swap file of several
gigabytes and see if that helps.
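Checking and, if need be, adding swap might look like this (a sketch;
the file path and the 2 GB size are arbitrary, and the creation steps
need root, so they are shown commented out):

```shell
# How much swap is configured and in use?
free -m 2>/dev/null || true   # summary including swap (procps)
cat /proc/swaps               # the kernel's list of active swap areas

# As root: create and enable a 2 GB swap file (path and size arbitrary)
# dd if=/dev/zero of=/swapfile bs=1M count=2048
# chmod 600 /swapfile
# mkswap /swapfile
# swapon /swapfile
```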

One thing to keep in mind when doing computations that produce huge
results is that R saves the last value of a successful top level
evaluation in .Last.value.  If that value is huge, gc can't do
anything about it until it is replaced by something smaller.  An
explicit call to gc() is not going to be able to release any more
things than an internal call made to satisfy an allocation, except to
the extent that some additional data will be reachable as part of the
computation that triggers the internal call.  Two successive top level
gc() calls may seem to do wonders compared to just one, just because
after the first one .Last.value has been replaced by the result
returned by gc().
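Luke's .Last.value point can be seen in a short session (a sketch of my
own, sizes arbitrary; run at the top level of an interactive R session):

```r
invisible(rnorm(1e7))  # ~80 MB result; not printed, but retained in .Last.value
gc()  # first call: .Last.value still references the vector, so it survives
gc()  # second call: .Last.value now holds the first gc()'s result, and
      # the big vector finally becomes collectable
```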

The memory management system also does some of its releasing of
smaller sized allocations gradually to avoid thrashing in malloc in
most situations. This is why memory use as well as triggering
thresholds can go down gradually on successive gc calls until they
reach a steady state.  This is based on heuristics that work
reasonably well across a wide range of uses but might not be ideal for
really pushing the memory limit.  At some point we might make some of
the tuning parameters for these heuristics available at the user
level, but this isn't high priority as fiddling with them is probably
much more likely to make things worse than better.

Hope that helps,

luke
#
Paul Gilbert <pgilbert@bank-banque-canada.ca> writes:
I believe this is in the operating system, not in R. When all system
memory has been used up, the kernel goes looking for a job to kill,
and that can be R, the X server, or whatever it feels like. It is
difficult for R to do anything about it, since the out-of-memory
condition can arise after the memory has nominally been allocated.
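The mechanism Peter describes is Linux memory overcommit (my gloss, not
his wording): by default the kernel lets allocations "succeed" beyond
what it can actually back, so the failure only surfaces when the pages
are touched. Strict overcommit accounting makes oversized requests fail
up front, where R can turn them into an ordinary error:

```shell
# Current policy: 0 = heuristic overcommit (the default),
# 1 = always overcommit, 2 = strict accounting
cat /proc/sys/vm/overcommit_memory

# As root: switch to strict accounting
# echo 2 > /proc/sys/vm/overcommit_memory
```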