How can I avoid nested 'for' loops or quicken the process?
Thank you for the clarification, Brian. This is very helpful (as usual). However, I think the important point, which I misstated, is that whether it be for() or, e.g., lapply(), the "loop" contents must be evaluated at the interpreted R level, and this is where most time is typically spent. To get the speedup that most people hope for, the key is to avoid the loop altogether, i.e. to move the loop **and** the evaluations to C level via R programming, e.g. by using matrix operations, indexing, or built-in .Internal functions. Please correct me if I'm (even partially) wrong. As you know, the issue arises frequently.

-- Bert Gunter
Genentech

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Prof Brian Ripley
Sent: Friday, December 26, 2008 12:44 AM
To: Oliver Bandel
Cc: r-help at stat.math.ethz.ch
Subject: Re: [R] How can I avoid nested 'for' loops or quicken the process?
On Thu, 25 Dec 2008, Oliver Bandel wrote:
Bert Gunter <gunter.berton <at> gene.com> writes:
FWIW: Good advice below! -- after all, the first rule of optimizing code is: Don't! For the record (yet again), the apply() family of functions (and their packaged derivatives, of course) are "merely" very carefully written for() loops: their main advantage is in code readability, not in efficiency gains, which may well be small or nonexistent. True efficiency gains require "vectorization", which essentially moves the for() loops from interpreted code to (underlying) C code (on the underlying data structures): e.g. compare rowMeans() [vectorized] with ave() or apply(..,1,mean).
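The rowMeans() versus apply(..,1,mean) comparison mentioned above can be tried with a minimal sketch like the following. The matrix size, the seed, and the variable names are arbitrary assumptions chosen for illustration, and the timings depend on the machine and R version, so no particular speed ratio is claimed.

```r
# Hypothetical test data: a 2000 x 50 numeric matrix (sizes chosen arbitrarily).
set.seed(1)
m <- matrix(rnorm(2000 * 50), nrow = 2000)

# Explicit for() loop: each mean() call is dispatched from interpreted R code.
loop_means <- numeric(nrow(m))
for (i in seq_len(nrow(m))) {
  loop_means[i] <- mean(m[i, ])
}

# apply(): more readable, but mean() is still called once per row from R code.
apply_means <- apply(m, 1, mean)

# rowMeans(): the loop over rows and the arithmetic both run in compiled code.
vec_means <- rowMeans(m)

# All three approaches agree up to floating-point tolerance.
stopifnot(isTRUE(all.equal(loop_means, apply_means)),
          isTRUE(all.equal(apply_means, vec_means)))

# Rough timing comparison; repeat each call a few times to get measurable numbers.
system.time(for (k in 1:20) apply(m, 1, mean))
system.time(for (k in 1:20) rowMeans(m))
```

On a typical installation one would expect the rowMeans() timing to be the smaller of the two, but that is something to measure rather than assume.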
[...] The apply-functions do bring speed advantages. This is not only what I have read about them: I have used the apply-functions and really got results faster. The reason is simple: an apply-function does in C what would otherwise be done at the R level with for-loops.
Not true of apply(): true of lapply() and hence sapply(). I'll leave you to check eapply, mapply, rapply, tapply. So the issue is what is meant by 'the apply() family of functions': people often mean *apply(), of which apply() is an unusual member, if one at all. [Historical note: a decade ago lapply was internally a for() loop. I rewrote it in C in 2000: I also moved apply to C at the same time but it proved too little an advantage and was reverted. The speed of lapply comes mainly from reduced memory allocation: for() is also written in C.]
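To see how for() and lapply() compare in practice, a minimal sketch along the following lines may help. The function f, the size n, and the variable names are hypothetical, and the growing-list loop is included only as one assumed way of making the memory-allocation point visible. Recent R versions handle growing vectors better than older ones did, so the timing gaps may be small.

```r
f <- function(i) i^2   # hypothetical per-element work, evaluated in interpreted R
n <- 1e4               # arbitrary problem size

# for() loop that grows its result list one element at a time.
res_grow <- list()
for (i in seq_len(n)) res_grow[[i]] <- f(i)

# for() loop that fills a preallocated list.
res_pre <- vector("list", n)
for (i in seq_len(n)) res_pre[[i]] <- f(i)

# lapply(): the result list is allocated once, but f() is still called n times.
res_lapply <- lapply(seq_len(n), f)

# The three loops produce the same list.
stopifnot(identical(res_grow, res_pre), identical(res_pre, res_lapply))

# Vectorized form: one call, with the loop and the arithmetic in compiled code.
res_vec <- seq_len(n)^2
stopifnot(isTRUE(all.equal(unlist(res_lapply), res_vec)))

# Timings vary by machine and R version; compare them rather than assume a ratio.
system.time(for (k in 1:50) lapply(seq_len(n), f))
system.time(for (k in 1:50) seq_len(n)^2)
```

In all three looping variants f() is evaluated at the R level, which is why only the vectorized form removes the per-element interpreter overhead.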
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595