R code for performance

5 messages · v.demart@libero.it, Eric Lecoutre, Brian Ripley +1 more

#
At the office I'm cautiously introducing R as our basic statistical
program, getting rid of licensed software or at least reducing the amount of it.
R would be used to run generic statistical programs built and "consumed"
when needed, plus some static procedures dealing with time series.
We have essentially three OS platforms, Windows XP, Linux and FreeBSD 5.4,
on similar PCs (Pentium 4, 2-2.5 GHz). My boss has asked me to
test the "average" performance (in terms of speed and memory use) of R on
each of these platforms, so that we can settle on one of them for a couple of PCs.

Could you please suggest some R source code (apart from the "static procedure",
which I will obviously test) to run on the three platforms to test performance?

If there is nothing of the kind, any suggestions?

Ciao
Vittorio
#
You could use the benchmark created by Philippe Grosjean to compare
various statistical packages. You will find it at:

http://www.sciviews.org/benchmark/

Note that you need to have the packages Matrix and SuppDists installed.

HTH,

Eric

Eric Lecoutre
UCL /  Institut de Statistique
Voie du Roman Pays, 20
1348 Louvain-la-Neuve
Belgium

tel: (+32)(0)10473050
lecoutre at stat.ucl.ac.be
http://www.stat.ucl.ac.be/ISpersonnel/lecoutre

If the statistics are boring, then you've got the wrong numbers. -Edward
Tufte
#
On Mon, 6 Jun 2005 v.demartino2 at virgilio.it wrote:

'make check' runs a lot of R code and times it.  The tests for the stats 
package look most relevant to you.  Beware of simplistic 'benchmarks' that 
test code snippets not relevant to your usage (and that may apply to the R 
examples which tend to be small datasets).

We know Linux (non-R-shlib) outperforms Windows XP by ca 20%, and some 
comments I have seen here suggest it outperforms FreeBSD as well.  But are 
such differences enough to determine your choice?
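
A minimal sketch of the kind of usage-specific timing suggested above, as opposed to a generic benchmark. The workload (fitting an AR model to a simulated series), the function name time_workload and all sizes are assumptions made up for illustration; replace the body with code that resembles your own procedure.

```r
# Hypothetical workload: fit an AR model to a simulated AR(1) series.
# Uses only the stats package that ships with R.
time_workload <- function(n = 5000, reps = 20, seed = 1) {
  set.seed(seed)
  elapsed <- numeric(reps)
  for (i in seq_len(reps)) {
    x <- arima.sim(model = list(ar = 0.7), n = n)
    # system.time() returns user, system and elapsed times; keep elapsed.
    elapsed[i] <- system.time(ar(x, order.max = 5))[["elapsed"]]
  }
  elapsed
}

timings <- time_workload()
summary(timings)  # compare the median across platforms, not a single run
```

Repeating the run and comparing medians reduces the influence of one-off effects such as disk caching or other processes on the machine.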
#
Prof Brian Ripley <ripley at stats.ox.ac.uk> writes:
The scripts from the MASS package can also be used as an informal
benchmark, perhaps a bit more of a realistic mix than the stats
package. (Or was there a reason that Brian didn't mention them?)

It might also be relevant to note that, at least for a while, there
isn't going to be a 64-bit Windows version (the compiler etc. tool
chain is missing), so if you have large memory requirements, Linux or
BSD is the way to go. They also tend to be much easier to get
configured for building your own packages or just for using C/Fortran
extensions. The flip side is of course the (perceived)
user-friendliness of Windows.

If you have hardcore linear algebra requirements (e.g. inversion of
large matrices), you need to look into builds linked against fast BLAS
code (Goto, ATLAS). Most of the standard builds do not use this, so
benchmarks will be quite misleading.
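
To get a rough idea of how much the linear algebra library matters on a given build, one could time a large matrix inversion along these lines. This is only a sketch: the function name time_inversion and the matrix size are arbitrary assumptions, and the same script would have to be run on each build being compared.

```r
# Time the inversion of a random n x n matrix.
# solve() hands the work to LAPACK/BLAS, so the result depends on
# which BLAS the R build is linked against.
time_inversion <- function(n = 500, seed = 1) {
  set.seed(seed)
  a <- matrix(rnorm(n * n), n, n)
  seconds <- system.time(a_inv <- solve(a))[["elapsed"]]
  # Sanity check: a %*% a_inv should be close to the identity matrix.
  max_error <- max(abs(a %*% a_inv - diag(n)))
  list(seconds = seconds, max_error = max_error)
}

res <- time_inversion()
res$seconds
```

Increase n until the timing is well above the clock resolution; very short runs say little about the BLAS in use.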
#
:-- Original message --
:Date: Mon, 6 Jun 2005 09:40:40 +0100 (BST)
:From: Prof Brian Ripley <ripley at stats.ox.ac.uk>
:To: v.demartino2 at virgilio.it
:cc: r-help <r-help at stat.math.ethz.ch>
:Subject: Re: [R] R code for performance
:
:
:On Mon, 6 Jun 2005 v.demartino2 at virgilio.it wrote:
:
:> At the office I'm cautiously introducing R as our basic statistical
:> program, getting rid of licensed software or at least reducing the amount of it.
:> R would be used to run generic statistical programs built and "consumed"
:> when needed, plus some static procedures dealing with time series.
:> We have essentially three OS platforms, Windows XP, Linux and FreeBSD 5.4,
:> on similar PCs (Pentium 4, 2-2.5 GHz). My boss has asked me to
:> test the "average" performance (in terms of speed and memory use) of R on
:> each of these platforms, so that we can settle on one of them for a couple of PCs.
:>
:> Could you please suggest some R source code (apart from the "static procedure",
:> which I will obviously test) to run on the three platforms to test performance?
:>
:> If there is nothing of the kind, any suggestions?
:
:'make check' runs a lot of R code and times it.  The tests for the stats
:package look most relevant to you.  Beware of simplistic 'benchmarks' that
:test code snippets not relevant to your usage (and that may apply to the R
:examples which tend to be small datasets).
:
:We know Linux (non-R-shlib) outperforms Windows XP by ca 20%, and some
:comments I have seen here suggest it outperforms FreeBSD as well.  But are
:such differences enough to determine your choice?
:

Thinking of my time-series procedure, I would definitely answer NO.
But the fact is that we have to do lots of simulations, most of them
Monte Carlo with many iterations, and we spend our time modifying them and
trying different hypotheses. The problem is that these simulations must be
done "on the fly" because, as we ironically say, they're needed "for yesterday".
Therefore there's not much time left to refine the R code and get the best
out of it. So, in this case, having a Ferrari makes the difference!

Ciao
Vittorio
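
For the Monte Carlo case described above, a small timing script of the following kind could be run unchanged on each platform. It is a sketch only: the simulated quantity (the mean of the maximum of k standard normals), the function name mc_max_normal and the iteration counts are illustrative assumptions, not the poster's actual simulations.

```r
# Monte Carlo estimate of E[max of k standard normal draws].
mc_max_normal <- function(n_iter = 20000, k = 10) {
  # Draw all random numbers at once; each column is one replication.
  draws <- matrix(rnorm(n_iter * k), nrow = k)
  mean(apply(draws, 2, max))
}

set.seed(42)  # same seed on every platform, so the same work is done
timing <- system.time(estimate <- mc_max_normal())
timing[["elapsed"]]  # wall-clock seconds for this run
```

Raising n_iter to the iteration counts actually used in practice makes the comparison closer to the real workload.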