which operating system + computer specifications lead to the best performance for R?

The answer depends on exactly what you are doing and how you do it.
It is not hard to get to the point where more hardware gets you less
done. For example, see

http://spectrum.ieee.org/computing/hardware/multicore-is-bad-news-for-supercomputers

I had a recent case where a bash script that I ran on our multi-core
Windows server (with Cygwin) ran about as fast, or faster, on a
second-hand eMachines box with less than 1 GB of memory running Debian.
This is not going to be typical, but if you care about performance you
will often be more concerned with how you use R than with the machine
in isolation. You'll find you can handle bigger problems with less memory
if you make your data structures and algorithms work together so
that you access memory in predictable ways. Virtual memory (VM) can be
quite tolerable as long as you don't start thrashing. Alternatively, if
you have streaming data sources and sinks and can use block-oriented
algorithms, you don't need to buffer a bunch of junk only to have it
stepping on the other junk. This applies as much to the R developers
as to you.

My favorite example from personal experience, not using R,
is another case where I was using a laptop to do a bunch of string
manipulations. With large data sets it turned out to be faster if I
sorted the data before passing it to the program that did
all the work (which should have been CPU-limited). You don't expect a
sort to be fast, and since the following program did not know the data
was sorted, it couldn't be expected to benefit from this. However, the
more regular memory accesses in the latter program avoided VM problems,
and performance went from unusable to no big problem. (Disk access is
1e6 times slower or worse than any RAM access, and even a fast disk only
brings this down to about 0.5e6, which is still large even when you get
lots of data at once or have some of it buffered in an OS-dependent way.)
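
To make the block-oriented idea above concrete, here is a minimal sketch
in Python. The thread itself contains no code, so the language, the
function name stream_mean, and the block_size parameter are all
illustrative assumptions. It computes a mean over a stream of numbers
while holding at most one block in memory at a time, instead of loading
the whole data set first.

```python
import io

def stream_mean(lines, block_size=1000):
    """Mean of one number per line, buffering at most block_size values.

    `lines` is any iterable of text lines (an open file, a pipe, ...).
    Both the name and the block size are hypothetical choices.
    """
    total = 0.0
    count = 0
    block = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        block.append(float(line))
        if len(block) >= block_size:
            # Fold the full block into the running totals, then reuse the buffer.
            total += sum(block)
            count += len(block)
            block.clear()
    # Fold in whatever is left in the final, partial block.
    total += sum(block)
    count += len(block)
    if count == 0:
        raise ValueError("no data in stream")
    return total / count

# Usage: an in-memory stream stands in for a large file or pipe.
source = io.StringIO("\n".join(str(i) for i in range(1, 101)))
print(stream_mean(source, block_size=16))  # 50.5
```

Memory use is bounded by block_size rather than by the size of the
input, which is the property that keeps you away from thrashing.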

And finally, if you really need speed and can't limit everything to optimized
library calls, you may need to write your own compiled code.
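
As a rough illustration of why optimized library calls matter, the
sketch below (again Python, purely as an assumed stand-in; loop_total is
a hypothetical helper) compares an interpreted loop with the built-in
sum, which is implemented in compiled code. No timings are quoted here
because they depend entirely on your machine; run it and see.

```python
import timeit

def loop_total(values):
    """Add up values one at a time in interpreted code."""
    total = 0
    for v in values:
        total += v
    return total

data = list(range(100_000))

# Both approaches must agree before comparing their speed.
assert loop_total(data) == sum(data)

loop_time = timeit.timeit(lambda: loop_total(data), number=20)
builtin_time = timeit.timeit(lambda: sum(data), number=20)
print(f"interpreted loop: {loop_time:.4f}s  built-in sum: {builtin_time:.4f}s")
```

If the step you need has no such library call, that is the point where
writing your own compiled code becomes worth considering.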