Erlang-style message-passing in R: Rmpi, Snow, NetWorkSpaces, etc.
What would you say typically limits taskPR's approach: not finding enough instruction-level parallelism at the R script level, or the communications overhead (probably latency) of trying to make use of it?
Depends on the specific function. The communication cost is significant, especially serialization and deserialization. (Since I finally found the right way to force a flush of the TCP data, the actual network cost isn't a problem for moderately sized data.) For simplicity of implementation and ease of correctness, a lot of the R environment is serialized and sent over with *each* operation. As for the instruction-level parallelism available, code that is a performance bottleneck is usually rewritten in C or Fortran and called in large blocks. So the program ends up looking for parallelism between those large blocks, which it usually can't find. I didn't have a lot of suitable code to try, so the best example program was one that did a complex calculation followed by an accumulate operation in a loop. Parallel-R/taskPR dynamically unrolled the loop (just as Tomasulo's algorithm does on a processor) and got a reasonable speedup (about half of linear). Unfortunately, I don't even have that code example any more.
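Since the original example is lost, here is a minimal sketch of the loop shape described above: an expensive, independent calculation per iteration followed by an accumulate. It is written in Python purely as an illustration and is not taskPR code; `complex_calculation`, `serial_loop`, and `unrolled_loop` are hypothetical names, and the thread pool merely stands in for remote workers. The sketch shows the dependency structure that makes dynamic unrolling possible, not a measured speedup (CPython threads will not speed up CPU-bound work).

```python
from concurrent.futures import ThreadPoolExecutor


def complex_calculation(i):
    # Hypothetical stand-in for the expensive step; it depends only on i,
    # never on the running total, so iterations are independent of each other.
    return sum(k * k for k in range(i * 1000)) % 97


def serial_loop(n):
    # The loop as the script author wrote it: calculate, then accumulate.
    total = 0
    for i in range(n):
        total += complex_calculation(i)
    return total


def unrolled_loop(n, workers=4):
    # Issue every iteration's calculation up front, then accumulate in the
    # original order. Each addition waits only for its own operand.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(complex_calculation, i) for i in range(n)]
        total = 0
        for future in futures:
            total += future.result()
    return total
```

Only the accumulate is a true serial dependency here, which is why this kind of loop was the best case: the calculations can overlap while the additions still happen in program order.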
If latency, then perhaps taskPR would work better in a multi-threaded R interpreter, rather than across a TCP/IP network fabric.
Yes, most especially if serialization and deserialization could be avoided. However, I don't believe R is thread-safe. (Using shared memory, but between multiple R processes, was on the TODO list when the project ended.) I was fortunate to have access to a very large NUMA machine when I was originally working on this project, so the network itself wasn't a limiting factor. (The network stack turned out to be a problem, though.)

David Bauer
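To illustrate why avoiding serialization matters, here is a small sketch, again in Python rather than R and not taken from taskPR. It contrasts handing data to a worker by encoding and decoding it (what separate processes talking over a socket must do) with handing over a reference (what threads in one address space, or a shared-memory mapping, could do). The function names are hypothetical, and `pickle` stands in for R's own serialization.

```python
import pickle


def send_by_serialization(obj):
    # Process-to-process transfer: encode to bytes, ship them, decode a copy.
    payload = pickle.dumps(obj)
    return pickle.loads(payload), len(payload)


def send_by_reference(obj):
    # Same address space: the receiver sees the very same object, zero bytes moved.
    return obj, 0


data = list(range(100_000))
copied, copied_bytes = send_by_serialization(data)
shared, shared_bytes = send_by_reference(data)
```

The serialized path pays for an encode, a decode, and a second copy of the data on every operation, and none of that cost goes away when the network underneath is fast.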