distributed R on EC2, designing the software stack
You should contact Robert Grossman, who just gave a presentation on this topic at R/Finance in Chicago.

Link: http://rinfinance.quantmod.com/speakers/

-Whit
On Wed, Apr 29, 2009 at 3:06 PM, Stephen J. Barr <stephenjbarr at gmail.com> wrote:
Greetings,

I am trying to get into distributed computing with R, but do not have access to a cluster. Therefore, I am trying to get distributed R running on Amazon's EC2 ( http://aws.amazon.com/ec2/ ). For those of you who don't know, EC2 allows you to instantiate large numbers of computers, bundled with whatever OS and software configuration you want.

From my survey of things, there are a lot of different options available for distributed computing. For my needs, I would just like to run simple Monte Carlo simulations and other things that don't require a ton of inter-node communication.

What I would like to do is put together a public AMI and a howto guide, such that it would be very easy for anyone to instantiate an N-node cluster and start with parallel computing.

I would like to have a discussion/brainstorm over what the exact software stack should be. My initial thoughts were:

1) R 2.9.0 + OpenMPI + Rmpi + Snowfall/sfCluster
   - Will Amazon's network work with OpenMPI? Perhaps it would be better to use PVM or something that is more tolerant of a non-optimal network.
2) R 2.9.0 + "socket based communication" + Snowfall/sfCluster
   - Is this scalable?
3) R 2.9.0 + twisted + NetWorkSpaces
   - Not sure if Amazon's network supports broadcast mode, which is required by twisted.
4) Biocep-R
   - This looks like it has the functionality to do what I want, but a lot of other stuff as well.
5) RHIPE
   - Hadoop is well supported by EC2. Perhaps this is the way to go. Seems like a very new package :)

What are people's thoughts on what would be a good software stack, with the constraint that it should be simple and run on EC2?

Thanks,
-stephen

==========================================
Stephen J. Barr
University of Washington
WEB: www.econsteve.com
_______________________________________________
R-sig-hpc mailing list
R-sig-hpc at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-hpc