What is the most cost-effective hardware for R?
On 05/08/2012 06:02 PM, Rich Shepard wrote:
On Tue, 8 May 2012, Hugh Morgan wrote:
Perhaps I have confused the issue. When I initially said "data points" I meant one stand-alone analysis, not one piece of data. Each analysis takes 1.5 seconds. I have not yet run this over the whole dataset, but I would expect it to take about 5 to 10 hours. That is just about acceptable, though quicker would be better. As I say, the exact analysis method has not yet been settled, and if it turns out to be significantly more computationally intensive, that could be a problem.
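For what it's worth, a back-of-envelope check of those figures (the 1.5 s per analysis and 5-10 hour range are taken from the thread; the implied dataset sizes are my arithmetic, not stated numbers):

```r
# Assumed per-analysis cost from the thread
secs_per_analysis <- 1.5

# Implied number of stand-alone analyses for the quoted run times
n_low  <- 5  * 3600 / secs_per_analysis   # 5 hours  -> 12,000 analyses
n_high <- 10 * 3600 / secs_per_analysis   # 10 hours -> 24,000 analyses
```

So the estimate is consistent with something on the order of 12,000-24,000 analyses in the full dataset.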
If I had to do what you describe above, I would split the data into chunks, one per core/CPU in my system, then launch an R instance on each core to process its chunk. With sufficient memory for each instance, the processing runs in parallel and divides the overall time by the number of instances. You might want to turn up the air conditioning around the system, 'cause that CPU is going to be working hard.
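A minimal sketch of that chunking approach using the parallel package that ships with R. Here `run_analysis` and `dataset` are placeholders for the real per-point analysis and data, which the thread does not specify:

```r
library(parallel)

# Use every available core; fall back to 1 if detection fails
n_cores <- max(1L, detectCores(), na.rm = TRUE)

# Placeholder for the real ~1.5 s analysis function
run_analysis <- function(x) x^2

# Placeholder data: one list element per stand-alone analysis
dataset <- as.list(seq_len(100))

# mclapply forks one worker per core (on Unix-alikes) and
# distributes the list elements across them
results <- mclapply(dataset, run_analysis, mc.cores = n_cores)
```

Note that `mclapply` relies on forking, so on Windows it runs serially; there you would use `makeCluster` with `parLapply` instead. Either way there is no need to split the data and launch separate R processes by hand.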
That is roughly how I am getting it running at the moment, and the 5 hour estimate assumes the work is perfectly parallelisable. We have a server room with reasonable air con. I had not thought about adding the extra cooling to the total cost until now, but I suspect it will come from a different budget and so may not matter much. I shall include it in the quote until told to do otherwise.
Rich
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.