Memory problem on a linux cluster using a large data set [Broadcast]

In addition to my off-list reply to Iris (pointing her to an old post of
mine detailing the memory requirements of randomForest in R), she might
consider the following:

- Use larger nodesize
- Use sampsize to control the size of bootstrap samples

Both of these reduce the size of the trees grown.  For a data set that
large, growing smaller trees may well make little difference to the
results.
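
As a rough illustration of the two options above, something along these
lines could be tried.  This is only a sketch: the data are simulated, and
the values nodesize = 50 and sampsize = 500 are arbitrary choices for the
example, not recommendations for Iris's data.

```r
library(randomForest)

set.seed(1)

# Simulated stand-in data (hypothetical; not the data set discussed here)
n <- 5000
x <- data.frame(matrix(rnorm(n * 10), ncol = 10))
y <- factor(ifelse(x$X1 + x$X2 + rnorm(n) > 0, "a", "b"))

# Defaults: nodesize = 1 for classification, each tree grown on a
# bootstrap sample of size n
rf_default <- randomForest(x, y, ntree = 50)

# Larger terminal nodes and a smaller sample per tree
rf_small <- randomForest(x, y, ntree = 50, nodesize = 50, sampsize = 500)

# treesize() gives the number of terminal nodes in each tree
mean(treesize(rf_default))
mean(treesize(rf_small))

# Approximate memory taken by each fitted forest object
print(object.size(rf_default), units = "Kb")
print(object.size(rf_small), units = "Kb")
```

Comparing the two forests on a subset of the real data should give a feel
for how much the trees shrink and whether the error rate suffers.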

Still, with data of that size, I'd say 64-bit is the better solution.

Cheers,
Andy

From: Martin Morgan