
different way for a for loop for several columns?

Hello,
Look at the first line of the inner loop:
it doesn't depend on 'y', so it could be moved outside that loop. This alone
would save time.
You can also save time by noticing that the 'ifelse' only chooses between the
values 1 and 0 depending on its condition, so the comparison itself is enough.
Simply doing

test.2 <- test[, l] > 1    # Use TRUE = 1 and FALSE = 0

makes it 20 times faster (!), as tested with a much larger 'test' data.frame.
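Here is a small illustration of why the comparison alone is enough. The vector 'x' is made up for the example and only stands in for one column of 'test'; it is not the original data.

```r
# Hypothetical values standing in for one column of 'test'
x <- c(0.5, 2, 3.1, 1, 7)

flag.ifelse  <- ifelse(x > 1, 1, 0)   # numeric 1/0, as in the original loop
flag.logical <- x > 1                 # logical TRUE/FALSE, no 'ifelse' needed

# In arithmetic TRUE counts as 1 and FALSE as 0, so both sums agree
sum(flag.ifelse) == sum(flag.logical)
```

Since the later step only sums these values, the logical vector can be used directly.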
With no loops, only *apply:

# After creating the data frames save a copy of the original 'test.1'
# (This line should be before the loops)
test.1b <- test.1

# Now, after the loops and after attributing names to the result, try the
# following.

# Could this become 'unique(Year$Date)' ?
y <- 1980:1983

# This is a matrix, not vector by vector like above. It's the matrix 'xx' in
# the nested 'apply'
test.2b <- test > 1

# This is the index 'jj' into 'xx'
# (a matrix only if every year has the same number of rows, otherwise a list)
test.3b <- sapply(y, function(yy) which(Year$Date == yy))

Length <- t(apply(test.3b, 2, function(jj)
			apply(test.2b, 2, function(xx) sum(xx[jj]))))

# Maybe we don't need to save 'test.1b', if it's possible to cbind(y,
# Length)
test.1b <- cbind(test.1b, Length)

names(test.1b) <- c("year", "C", "B", "F")

all.equal(test.1, test.1b)

I bet this is faster.
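If you want to try the idea without the original data, the following is a self-contained sketch. The data frames 'Year' and 'test' here are invented stand-ins (4 years, 5 rows each, random values), so only the mechanics carry over. It also shows 'rowsum' as one possible alternative to the nested 'apply'; that is my suggestion here, not something tested on the real data.

```r
set.seed(1)

# Hypothetical stand-ins for the original objects
Year <- data.frame(Date = rep(1980:1983, each = 5))
test <- data.frame(C = runif(20, 0, 2),
                   B = runif(20, 0, 2),
                   F = runif(20, 0, 2))

y <- unique(Year$Date)

# Logical matrix: TRUE where the value exceeds 1
test.2b <- test > 1

# Row indices for each year, one column per year (5 x 4 matrix here)
test.3b <- sapply(y, function(yy) which(Year$Date == yy))

# Count the TRUE values per year and per column with nested 'apply'
Length <- t(apply(test.3b, 2, function(jj)
    apply(test.2b, 2, function(xx) sum(xx[jj]))))

# Alternative: let rowsum() do the grouping by year.
# Adding 0L converts the logical matrix to integer, which rowsum() requires.
Length2 <- rowsum(test.2b + 0L, group = Year$Date)

# Both give the same counts (dimnames differ, so drop them first)
all.equal(unname(Length), unname(Length2))
```

The 'rowsum' version does not need the index matrix, so it should also work when the years have different numbers of rows.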

A final note: it's not a very good idea to use the names of R's objects as
variable names.
For instance, 'length'. Prefer 'Length', which is not an existing
object/function.
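A quick way to check a candidate name, plus a small made-up example of what happens when 'length' is reused. R still finds the function when it is called, so nothing breaks here, but the name then means two things at once.

```r
# Does the name already refer to a function?
exists("length", mode = "function")   # base R defines length()
exists("Length", mode = "function")   # normally not defined, unless a loaded package adds it

length <- c(3, 1, 4)     # legal, but the name now has two meanings
n <- length(length)      # calls the base function on the vector above
rm(length)               # remove the masking variable again
```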

Hope this helps,

Rui Barradas



--
View this message in context: http://r.789695.n4.nabble.com/different-way-for-a-for-loop-for-several-columns-tp4385705p4385992.html
Sent from the R help mailing list archive at Nabble.com.