Erlang-style message-passing in R: Rmpi, Snow, NetWorkSpaces, etc.
On Thu, Sep 04, 2008 at 04:06:31PM -0400, David Bauer wrote:
> taskPR was an attempt to get 'free' parallelism out of already existing programs by using simple data dependencies to figure out which individual statements in a program can be run in parallel. The name comes from the description of the program as exploiting task-level parallelism.
Ah, and thus your reference to Tomasulo's algorithm, interesting. Thanks for straightening me out there. http://users.ece.gatech.edu/~gte810u/Parallel-R/
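To make the data-dependency idea concrete, here is a minimal sketch of grouping statements into "waves" that could run in parallel. It is in Python rather than R, and it is only an illustration under my own assumptions, not taskPR's actual code: the Stmt type, the schedule() function, and the hand-written read/write sets are all hypothetical. Unlike Tomasulo's algorithm, which renames registers to remove write-after-read and write-after-write hazards, this sketch simply treats all three hazard types as real dependencies.

```python
from typing import NamedTuple

class Stmt(NamedTuple):
    text: str          # source text, used only for display
    writes: frozenset  # variables the statement assigns
    reads: frozenset   # variables the statement uses

def depends(later: Stmt, earlier: Stmt) -> bool:
    """True if `later` must wait for `earlier` to finish."""
    return bool(
        later.reads & earlier.writes      # read-after-write
        or later.writes & earlier.reads   # write-after-read
        or later.writes & earlier.writes  # write-after-write
    )

def schedule(stmts):
    """Return a list of waves; statements in one wave are mutually independent."""
    level = []
    for i, s in enumerate(stmts):
        # A statement runs one wave after the latest statement it depends on.
        deps = [level[j] for j in range(i) if depends(s, stmts[j])]
        level.append(1 + max(deps, default=-1))
    waves = [[] for _ in range(max(level, default=-1) + 1)]
    for s, lv in zip(stmts, level):
        waves[lv].append(s.text)
    return waves

# Hypothetical four-statement R script, with read/write sets filled in by hand.
program = [
    Stmt("a <- f(x)", frozenset({"a"}), frozenset({"x"})),
    Stmt("b <- g(y)", frozenset({"b"}), frozenset({"y"})),
    Stmt("c <- a + b", frozenset({"c"}), frozenset({"a", "b"})),
    Stmt("d <- h(x)", frozenset({"d"}), frozenset({"x"})),
]

for n, wave in enumerate(schedule(program)):
    print(n, wave)
```

In a real R script the read/write sets would have to come from parsing the code, and function calls with side effects would make the analysis much more conservative than this toy suggests.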
> (If anybody actually uses or has successfully used this package, I would love to hear about it, btw. While the package *does* work, there are probably few cases where it is worth it.)
What would you say typically limits taskPR's approach: not finding enough instruction-level parallelism at the R script level, or the communications overhead (probably latency) of trying to make use of it? If latency, then perhaps taskPR would work better in a multi-threaded R interpreter, rather than across a TCP/IP network fabric. To roughly test that empirically (assuming you are in fact using MPI for the communications), I suppose you could start up your several R processes on a single fat SMP node, and use an MPI that sends messages through fast shared memory. That's probably still slower than thread-to-thread communication, but it should have much lower latency than TCP/IP. Maybe you already tried something like that?
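As a back-of-the-envelope way to frame the latency question, here is a toy cost model, again in Python and again purely a sketch. It assumes every task pays its compute time plus one message round trip, and that tasks are dealt out evenly to the workers. The estimated_speedup() function and the latency figures are illustrative assumptions, not measurements of taskPR, MPI, or any real network.

```python
import math

def estimated_speedup(n_tasks, task_seconds, latency_seconds, workers):
    """Toy model: serial time divided by parallel time with per-task round trips."""
    serial = n_tasks * task_seconds
    rounds = math.ceil(n_tasks / workers)  # tasks each worker handles in turn
    parallel = rounds * (task_seconds + 2 * latency_seconds)
    return serial / parallel

# Assumed one-way latencies, chosen only to show the shape of the trade-off.
transports = [("TCP/IP", 100e-6), ("shared memory", 1e-6)]

for label, latency in transports:
    for task in (10e-6, 1e-3):  # a tiny statement vs. a heavier one
        s = estimated_speedup(1000, task, latency, workers=4)
        print(f"{label:14s} task={task:.0e}s speedup={s:.2f}")
```

Under these assumptions, statements that take about as long as a message round trip gain nothing (or lose) from being shipped to another process, which is why the shared-memory test on one SMP node seems worth trying.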
Andrew Piskorski <atp at piskorski.com> http://www.piskorski.com/