Distributed computing
On 15 October 2011 at 23:00, Akshay Jain wrote:
| Hi everyone
|
| I have about 5-6 old laptops which are lying waste. I want to do some large
| data analysis using neural network algorithms. What is the best way to
| connect them to a grid in order to pool their CPU resources?
|
| There is a package "gridR" or RHIPE? Which is the best package for my
| needs? What is the one which requires minimum technical knowledge as I am
| not from an IT/computer science background, so not familiar with java etc.

We wrote a survey paper on the 'state of the art in parallel computing with R' (see http://www.jstatsoft.org/v31/i01). We found Rmpi and snow to be dominant in most use cases, and I still find their setup easier than Hadoop, but others may differ.

You can turn these laptops into a quickly built cluster simply by dropping Ubuntu or Debian onto them, but it helps if you know some Unix/Linux tricks and know, e.g., how to propagate ssh keys.

Distributed computing with minimal IT knowledge is unfortunately a little bit of a contradiction in terms. Your easiest bet may be to donate the laptops, buy a cheap four- or six-core box, and rely on the multicore package, or on the parallel package in R 2.14.0, due out in two weeks.

Dirk
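
PS: To give a feel for the snow-style workflow mentioned above, here is a minimal sketch using the parallel package (snow offers the same style of calls). The host list and the slow_square() function are hypothetical placeholders, not anything from the original question: the two "localhost" entries just start workers on the current machine, and you would replace them with the laptops' hostnames once passwordless ssh is working.

```r
# Minimal sketch: spread a toy computation over socket workers.
library(parallel)  # ships with R >= 2.14.0

# Hypothetical host list; swap in the laptops' hostnames once ssh keys
# are in place. "localhost" entries start workers on this machine.
hosts <- c("localhost", "localhost")

cl <- makePSOCKcluster(hosts)

# Hypothetical stand-in for one unit of real work (e.g. fitting one model).
slow_square <- function(x) {
  Sys.sleep(0.1)
  x^2
}

# Hand the eight tasks out to the workers and collect the results.
results <- parLapply(cl, 1:8, slow_square)

# Always shut the workers down when done.
stopCluster(cl)

total <- sum(unlist(results))
print(total)
```

On a single multi-core box the same code applies unchanged, which is part of why that route is the easier one.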
"Outside of a dog, a book is a man's best friend. Inside of a dog, it is too dark to read." -- Groucho Marx