Resources for optimizing code
On Fri, 5 Nov 2004, Roger Bivand wrote:
On Fri, 5 Nov 2004, Janet Elise Rosenbaum wrote:
I want to eliminate certain observations in a large dataframe (21000x100). I have written code which does this using a binary vector (0=delete obs, 1=keep), but it uses for loops, and so it's slow and in the extreme it causes R to hang for indefinite time periods.

I'm looking for one of two things:
1. A document which discusses how to avoid for loops and situations in which it's impossible to avoid for loops.
or
2. A function which can do the above better than mine.
?subset

    newdata <- subset(DATAFRAME, asst==1)

which will work whether DATAFRAME is a matrix or data.frame (two different classes).
Sorry, not for matrices:
    A <- matrix(1:20, 5)
    asst <- c(1,0,0,1,0)
    subset(A, asst)
    [1]  1  4  6  9 11 14 16 19

Maybe it should, but in biggish problems like this it is almost certainly a bit more efficient to use the bare tools, that is indexing.
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
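
For readers following the thread, here is a minimal sketch of what "indexing" might look like for the problem in the original question. It is an illustration, not code posted by any of the participants. The simulated contents of DATAFRAME and asst, and the names keep, asst_small and A_kept, are assumptions made up for the example.

```r
# Hypothetical stand-in for the 21000 x 100 data frame in the question.
set.seed(1)
DATAFRAME <- as.data.frame(matrix(rnorm(21000 * 100), nrow = 21000))
asst <- rbinom(21000, 1, 0.5)   # 0 = delete observation, 1 = keep

# Turn the 0/1 vector into a logical index and select rows in one step;
# no for loop is involved.
keep <- asst == 1
newdata <- DATAFRAME[keep, , drop = FALSE]

# The same kind of index works on a matrix and keeps whole rows.
A <- matrix(1:20, 5)
asst_small <- c(1, 0, 0, 1, 0)
A_kept <- A[asst_small == 1, , drop = FALSE]   # rows 1 and 4 of A
```

One caveat with this sketch: if the indicator vector contains NA, logical indexing returns rows of NA for those positions, whereas subset() drops them. Wrapping the condition in which(), as in DATAFRAME[which(asst == 1), ], is one way to get the same behaviour as subset() with plain indexing.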