
Unreproducible crashes of R instances on cluster running Torque

Simon Urbanek <simon.urbanek at r-project.org> writes:
If I remember correctly, memory fragmentation plays an important role
for R (still in version 3.0.0?): a single contiguous block of memory
must be available for an allocation to succeed. Otherwise one can get
these error messages even if enough memory is available in total, but
fragmented into smaller blocks (or does Torque take care of memory
fragmentation?)
Thanks for this discussion - these are exactly the symptoms I
experienced and could not make sense of: crashing R sessions on the
cluster, and hanging nodes that needed to be restarted before they
would work again. I had assumed that Torque would protect a node from
crashing due to excessive memory usage.

One point is mentioned here again and again: monitor memory usage. But
is there an easy way to do this? Can I submit a script to Torque and
get back a memory report in a log file, which I can then analyse to
see memory usage over time?
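(One way to get such a log, sketched here as a hypothetical wrapper one
could submit as the job script: run the real workload in the background
and poll its resident set size with `ps` until it exits. The file names
and the polling interval are illustrative assumptions, not anything
Torque provides; in a real job the stand-in command would be replaced
by e.g. `Rscript analysis.R`.)

```shell
#!/bin/sh
# Hypothetical job wrapper: sample the workload's resident memory
# (in kB, as reported by ps on Linux) into a timestamped log file.
LOG=mem_usage.log
: > "$LOG"

sleep 3 &                    # stand-in for: Rscript analysis.R &
PID=$!

# Poll while the process is still alive; kill -0 only tests existence.
while kill -0 "$PID" 2>/dev/null; do
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" \
           "$(ps -o rss= -p "$PID")" >> "$LOG"
    sleep 1                  # sampling interval; tune for long jobs
done
wait "$PID"
```

The resulting `mem_usage.log` has one timestamped RSS sample per line,
which is easy to plot afterwards to see memory usage over the job's
lifetime.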

Rainer