rmpi vs snow - which one is better from communication overhead point of view

2 messages · rdxcheena, Stephen Weston

#
Hi,

I need to understand when it is best to use /rmpi/ and when it is best to
use /snow/ for parallel programming in R. I understand snow can also be used
for a group of non-clustered workstations. But I wish to understand this from
the point of view of using both on clusters, for a problem which has a few
chunks of straightforward data-parallelism interleaved with some
communication. Since both are based on /mpi/, which one provides better
performance for the same kind of communication? Can I do explicit send,
receive, broadcast, etc. with snow?

Also, if I use /foreach/ on either of these, does this add further overhead?

Please help me understand the difference in the provisions of the two and
select one of them for my current and future projects.

Thanks a lot in advance.

Best,

Aditi

--
View this message in context: http://r.789695.n4.nabble.com/rmpi-vs-snow-which-one-is-better-from-communication-overhead-point-of-view-tp4260660p4260660.html
Sent from the R devel mailing list archive at Nabble.com.
#
On Wed, Jan 4, 2012 at 4:57 AM, rdxcheena <rdxcheena at gmail.com> wrote:
Snow uses MPI via the Rmpi package, so you can always write equivalent
code in Rmpi that is at least as fast as snow.  You might want to read the
paper "State of the Art in Parallel Computing with R" by Markus Schmidberger,
Martin Morgan, Dirk Eddelbuettel, Hao Yu, Luke Tierney, and Ulrich
Mansmann for more information on that subject.
Snow doesn't provide any explicit communication operations, unless you
count clusterExport.
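
As an illustration of what snow does offer, here is a minimal sketch that pushes a variable to the workers with clusterExport and then runs a data-parallel step with clusterApply. It assumes two socket workers on the local machine so it can run without an MPI installation; on a cluster, makeCluster(2, type = "MPI") would start the workers through Rmpi instead. The variable name offset is made up for the example.

```r
library(snow)

# Assumption: two socket workers on localhost (not part of the original thread).
cl <- makeCluster(2, type = "SOCK")

# Define a value on the master and copy it to every worker's global environment.
offset <- 100
clusterExport(cl, "offset")

# Data-parallel step: each task reads the exported value on its worker.
res <- clusterApply(cl, 1:4, function(i) i + offset)

stopCluster(cl)
print(unlist(res))
```

Note that the master drives everything here: workers never send to or receive from each other.
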
Yes, foreach will definitely add overhead, and it doesn't give you access to
explicit communication either.
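
For reference, a minimal sketch of foreach running on top of a snow cluster, assuming the doSNOW backend package is installed and, again, two socket workers on the local machine. The loop body is only a placeholder.

```r
library(snow)
library(foreach)
library(doSNOW)

# Assumption: two socket workers on localhost; use type = "MPI" on a cluster.
cl <- makeCluster(2, type = "SOCK")

# Register the snow cluster as the backend that %dopar% dispatches to.
registerDoSNOW(cl)

# Each iteration runs on a worker; .combine = c collects results into a vector.
squares <- foreach(i = 1:4, .combine = c) %dopar% {
  i^2
}

stopCluster(cl)
print(squares)
```

Wrapping the same loop in system.time(), once with foreach and once with a plain clusterApply call, is one way to measure the extra overhead on your own cluster.
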
If you're primarily interested in performance, you should almost certainly
pick Rmpi.  And if you want to perform explicit MPI communication, such as
broadcasting, it's your only choice as far as I know.
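
A rough sketch of explicit broadcast, send, and receive with Rmpi, assuming a working MPI installation that allows the master to spawn two workers. The variable names (multiplier, results) and the tag value are made up for the example.

```r
library(Rmpi)

# Assumption: the MPI runtime permits spawning two worker processes.
n_workers <- 2
mpi.spawn.Rslaves(nslaves = n_workers)

# Broadcast: copy an object from the master to every worker.
multiplier <- 10
mpi.bcast.Robj2slave(multiplier)

# Tell every worker to do an explicit send back to the master (rank 0).
mpi.bcast.cmd(mpi.send.Robj(multiplier * mpi.comm.rank(), dest = 0, tag = 1))

# Explicit receive on the master: one message per worker, in arrival order.
results <- numeric(n_workers)
for (k in seq_len(n_workers)) {
  value <- mpi.recv.Robj(source = mpi.any.source(), tag = 1)
  sender <- mpi.get.sourcetag()[1]  # rank of the worker that sent this message
  results[sender] <- value
}

print(results)

# Shut the workers down; call mpi.quit() when the R session itself should end.
mpi.close.Rslaves()
```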

- Steve