Skip to content

R on a supercomputer

3 messages · Mark Kimpel, Tony Plate, Sean Davis

#
In general, R is not written in such a way that data remain in cache. 
However, R can use optimized BLAS libraries, and these are.   So if your 
version of R is compiled to use an optimized BLAS library appropriate to 
the machine (e.g., ATLAS, or Prof. Goto's Blas), AND a considerable 
amount of the computation done in your R program involves basic linear 
algebra (matrix multiplication, etc.), then you might see a good speedup.

-- Tony Plate
Kimpel, Mark William wrote:
#
On 10/10/05 3:54 PM, "Kimpel, Mark William" <mkimpel at iupui.edu> wrote:

            
Using the cluster model (which may or may not be what you are calling a
supercomputer--I don't know the exact terminology here), jobs that involve
repetitive, independent tasks like computing statistics on bootstrap
replicates can benefit from parallelization IF the "I/O" associated with
running the single replicate does not outweigh the benefit of using multiple
processors.  For example, if you are running 10000 replicates and each takes
1 ms, then you have a 10 second job on a single processor.  One could
envision spreading that same process over 1000 processors and doing the job
in 10 ms, but if one counts the I/O (network, moving into cache, etc.) which
could take 1 second per batch of replicates (for example), then that job
will take AT LEAST 10 seconds with 1000 processors, also.  However, if the
same computation takes 1 second per replicate, then the whole job takes
10,000 seconds on a single processor, but only about 11 seconds on the 1000
processors (approximately).  This rationale is only approximate, but I hope
it shows the point.

We have begun to use a 60-node linux cluster for some of our work (also
microarray-based) and use MPI/snow with very nice results for multiple
independent, long-running tasks.  Snow is VERY easy to use, but one could
also drop back to the Rmpi if needed, to have finer-grain control over the
parallelization process.

As for how caching behaviors come into it and how R without "parallelized"
R-code would perform, I can't really comment; my experience is limited to
the "cluster" model with parallelized R-code.

Sean