
speeding up functions for large datasets

3 messages · Freja.Vamborg@astrazeneca.com, Brian Ripley, Jean Eid

#
Dear R-helpers, 
I'm dealing with large datasets, say tables of 60 000 rows by 12 columns or so, and
some of the functions are (too) slow, so I'm trying to find ways to speed them up.
I've found that, for instance, for-loops are slow in R (both by testing and by
searching through the mail archives etc.).
Are there any other well-known things that are slow in R, maybe at the
data-representation level, in how the code is written, or in reading in the data?
I've also tried incorporating C code, which works well, but I'd like to
find other, maybe more "shortcut", ways.

Thanks in advance, 
Freja!
#
On Fri, 6 Aug 2004 Freja.Vamborg at astrazeneca.com wrote:

I don't think that is really true, but it is the case that using
row-by-row operations in your situation would be slow *if they are
unnecessary*. It is a question of choosing the right algorithmic approach,
not whether it is implemented by for-loops or lapply or ....
`S Programming' (see the R FAQ) has a whole chapter on this sort of thing, 
with examples.  More generally you want to take a `whole object' view and 
use indexing and other vectorized operations.
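As a rough illustration of the `whole object' view (my own toy example, not from
the thread), on a 60 000 x 12 table a single vectorized call replaces the
row-by-row loop:

## toy 60 000 x 12 numeric table
x <- matrix(rnorm(60000 * 12), ncol = 12)

## row-by-row: one sum per iteration of a for-loop (slow)
slow <- numeric(nrow(x))
for (i in seq_len(nrow(x))) slow[i] <- sum(x[i, ])

## whole-object: the same result from one vectorized call
fast <- rowSums(x)
all.equal(slow, fast)   # TRUE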

Note also that what is slow does change with the version of R and 
especially how much memory you have installed.  The first step is to get 
enough RAM.
#
You might want to turn your data into a matrix. You get much, much faster
for-loops doing that.
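
For instance (my own sketch, not Jean's code), the same loop run over a data
frame and over a matrix can be compared with system.time():

df <- as.data.frame(matrix(rnorm(60000 * 12), ncol = 12))
m  <- as.matrix(df)

loop_sum <- function(d) {
    out <- numeric(nrow(d))
    for (i in seq_len(nrow(d))) out[i] <- sum(d[i, ])
    out
}

system.time(loop_sum(df))   # data frame: row extraction is comparatively slow
system.time(loop_sum(m))    # matrix: the same loop runs much faster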

Jean,
On Fri, 6 Aug 2004 Freja.Vamborg at astrazeneca.com wrote: