
HPC with standard R functions

On 09/28/2013 01:19 PM, Simone Ruzza wrote:
I'm not sure you've told us enough to answer you.

If your task is repetitive (such as Monte Carlo analysis), then the 
answer is most likely yes, you can parallelize it.
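
As a rough illustration of the repetitive case, here is a minimal sketch using the 'parallel' package. The function one_replicate and the pi-estimation task are made up for the example, and the worker count of 2 is an assumption; substitute your own simulation step.

```r
library(parallel)

# Hypothetical single replicate: estimate pi from n random points.
one_replicate <- function(i, n = 10000) {
  x <- runif(n)
  y <- runif(n)
  4 * mean(x^2 + y^2 <= 1)
}

cl <- makeCluster(2)                  # 2 local workers (assumption; adjust)
clusterSetRNGStream(cl, iseed = 123)  # independent RNG streams per worker
estimates <- parLapply(cl, 1:20, one_replicate)
stopCluster(cl)

# Combine the independent replicates into one estimate.
pi_hat <- mean(unlist(estimates))
```

The replicates do not depend on each other, which is what makes this pattern easy to spread across workers.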

If your data can be partitioned, and your model can be fit on the 
partitions, then the answer is most likely yes, you can parallelize it.
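
A sketch of the partitioned-data case, assuming the model can be fit separately on each piece. The built-in mtcars data, the split by cyl, and the helper fit_partition are only placeholders for your own data and model.

```r
library(parallel)

# Partition the built-in mtcars data by cylinder count (illustrative only).
parts <- split(mtcars, mtcars$cyl)

# Hypothetical per-partition fit: a simple linear model of mpg on weight.
fit_partition <- function(d) coef(lm(mpg ~ wt, data = d))

cl <- makeCluster(2)                  # 2 local workers (assumption; adjust)
coefs <- parLapply(cl, parts, fit_partition)
stopCluster(cl)

# One row of coefficients per partition.
coef_table <- do.call(rbind, coefs)
```

How the per-partition results should be combined depends entirely on the model, so the final rbind is just one possible choice.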

If your model can be partitioned, so that some or all of the 
sub-functions from other packages that you mention can be called in 
parallel on your large data, then the answer is most likely yes.
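
A sketch of the partitioned-model case, where independent steps run side by side on the same data. The three functions in steps are base R stand-ins, since I don't know which package functions you actually call, and big_data is a placeholder.

```r
library(parallel)

# Stand-ins for the independent sub-steps (hypothetical; the real ones
# would come from the packages used in the analysis).
steps <- list(
  center = function(d) colMeans(d),
  spread = function(d) apply(d, 2, sd),
  corr   = function(d) cor(d)
)

big_data <- as.matrix(mtcars)         # placeholder for the large data set

cl <- makeCluster(2)                  # 2 local workers (assumption; adjust)
# Each task applies one step to a copy of the data sent to the worker.
results <- parLapply(cl, steps, function(f, d) f(d), d = big_data)
stopCluster(cl)
```

Note that every worker receives its own copy of the data, so with truly large data the transfer cost can eat into the gain.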

In terms of technology to use, at this point you'd have to tell us about 
the cluster you want to run it on. That would help us decide whether 
you should be looking at 'parallel', now part of base R; 'foreach', 
which has what I believe to be the very nice property that the same 
code can run on any parallel backend, or on none, without being 
changed; or something very specific like Rmpi, because the cluster you 
hope to use uses that as its parallel backend. (There are other 
possible endpoints too, but these seem to be the most popular.)
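
To show what I mean about 'foreach', here is a minimal sketch, assuming the foreach package is installed. The loop body is a toy; the point is that only the backend registration changes, not the loop.

```r
library(foreach)

# Sequential backend: the loop below runs with no parallelism at all.
registerDoSEQ()

# To run in parallel, register a different backend instead, for example
# (assuming the doParallel package is installed):
#   library(doParallel)
#   cl <- makeCluster(2)
#   registerDoParallel(cl)

# The loop itself is identical whichever backend is registered.
squares <- foreach(i = 1:5, .combine = c) %dopar% {
  i^2
}
```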

But from what I read above, you haven't given us enough detail about 
what you need to do for me at least to say anything definitive.

Regards,

Brian