Easiest Road to Parallel R?
Tom,
On 21 July 2009 at 08:33, Thomas Hampton wrote:
| We have a substantial beowulf cluster and would like to
| get parallel R going. Our systems administrators attempted
| without success to get the R function papply to run
| properly without success. I passed their comments/questions on to this
| list in a previous message. The way I understand it, the various
| pieces are there and report no errors, but the final result is that no
| parallelism is achieved.
|
| Is there some more bullet-proof route to parallel R than mpich2, Rmpi
| and papply?
|
| We are on a beowulf cluster, red hat linux.

There are a few questions here that may profit from separation:

 0: Should you use R in parallel?  Yup, so that's a given.

 1: What _software level_ is recommended?  If you follow the Schmidberger
    et al survey paper (linked from the CRAN Task View on High Performance
    Computing and otherwise to be had via Google or 'real soon now' at JSS)
    then you land at Rmpi and Snow.

 2: Given a stated preference for Rmpi, how do you get it going?  Hao Yu
    does an admirable job trying to let the configure script find Open MPI,
    LAM, MPICH2, DeinoMPI, ...  I have had good results with Rmpi on Debian
    and Ubuntu but at times had to make changes to the configure script,
    which Hao then incorporated.  Rmpi and friends tend to work out of the
    box on Debian and Ubuntu, using the binaries provided by the distro.

 3: Given that you are on a RH system, maybe you should also seek help on
    r-sig-fedora for the distro-specific hints.

And as a general rule that we echo often here, test components in 'layers'.
I.e., before attempting to get Rmpi installed, verify that you can actually
send an MPI variant of "hello, world" around etc.

Hth, Dirk
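
PS  To illustrate the 'layers' idea one level up: once a plain MPI "hello,
world" runs across your nodes and Rmpi is installed, something like the
following could serve as the R-level check. It is only a minimal sketch. It
assumes a working MPI installation that Rmpi was built against, and the
choice of two workers is arbitrary. What it prints depends entirely on your
cluster, so no output is shown.

```r
## Minimal Rmpi "hello, world" sketch; assumes MPI and Rmpi already work.
library(Rmpi)

## Spawn two R worker processes (the count of 2 is an arbitrary assumption).
mpi.spawn.Rslaves(nslaves = 2)

## Ask every worker to report its rank, the communicator size, and its host.
## If all replies name the same host, the work is not being spread out.
print(mpi.remote.exec(paste("hello from rank", mpi.comm.rank(),
                            "of", mpi.comm.size(),
                            "on", Sys.info()[["nodename"]])))

## Shut the workers down and leave MPI cleanly.
mpi.close.Rslaves()
mpi.quit()
```

If this step fails while the plain MPI test passed, the problem is most
likely in how Rmpi was configured against your MPI library rather than in
papply or anything layered on top of it.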
Three out of two people have difficulties with fractions.