More efficient option to append()?

As I already stated in my reply to your earlier post:

Resending the answer for the archives of the mailing list.

Hi Alex,

The other reply already gave you the R way of doing this while avoiding
the for loop. However, there is a more general reason why your for loop
is terribly inefficient. A small set of examples:

largeVector = runif(10e4)   # 10e4 = 100,000 random values
outputVector = NULL
system.time(for(i in 1:length(largeVector)) {
    outputVector = append(outputVector, largeVector[i] + 1)
})
#   user  system elapsed
#  6.591   0.168   6.786

The problem in this code is that outputVector keeps on growing. Every
call to append() creates a new, longer vector and copies the existing
elements into it, so the total amount of copying grows roughly
quadratically with the length of largeVector. This process is really
slow. Several (much) faster alternatives exist:

# Pre-allocating the outputVector
outputVector = rep(0,length(largeVector))
system.time(for(i in 1:length(largeVector)) {
    outputVector[i] = largeVector[i] + 1
})
#   user  system elapsed
# 0.178   0.000   0.178
# Speed-up of about 37 times; the gap only widens as largeVector
# gets longer
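
To put a rough number on why that gap widens, here is a back-of-the-envelope
sketch. It rests on a simplified cost model that is my assumption rather
than a measurement: each append() copies every element already in the
vector, while the pre-allocated loop writes each element once. The function
names are made up for illustration.

```r
# Simplified cost model (an assumption, not a measurement): every append()
# copies all elements already in the vector before adding the new one.
copiesWhenGrowing <- function(n) {
    n <- as.numeric(n)   # use doubles to avoid integer overflow for large n
    n * (n - 1) / 2      # 0 + 1 + ... + (n - 1) elements copied in total
}

# With pre-allocation each element is written exactly once.
writesWhenPreallocated <- function(n) as.numeric(n)

# Compare the two counts for a few vector lengths.
sizes <- c(1e3, 1e4, 1e5)
print(data.frame(
    n            = sizes,
    growing      = copiesWhenGrowing(sizes),
    preallocated = writesWhenPreallocated(sizes)
))
```

Under this model the ratio between the two counts is (n - 1) / 2, so it
keeps increasing with n. Real timings will not follow it exactly, since
memory allocation and garbage collection also play a role.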

# Using apply functions
system.time(outputVector <- sapply(largeVector, function(x) return(x + 1)))
#   user  system elapsed
#  0.124   0.000   0.125
# Even a bit faster

# Using vectorisation
system.time(outputVector <- largeVector + 1)
#   user  system elapsed
#  0.000   0.000   0.001
# Practically instant: roughly 6800 times faster than the first example
# (6.786 s vs 0.001 s elapsed, which is at the limit of timer resolution)

It is not always clear which method is most suitable and which performs
best. At least they all perform much, much better than the naive option
of letting outputVector grow.
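
If you want to convince yourself that the approaches are interchangeable,
the following sketch runs all of them on a small vector and checks that the
results agree. The input size and seed are arbitrary choices of mine, and I
have used vapply() and seq_along() here instead of sapply() and
1:length(); those are just variants I would suggest, not what was timed
above.

```r
set.seed(1)                     # arbitrary seed, only for reproducibility
smallVector <- runif(1000)      # small illustrative input, not the timed one

# 1. Growing the result with append() (the slow option)
grown <- NULL
for (x in smallVector) grown <- append(grown, x + 1)

# 2. Pre-allocating the result and filling it in place
preallocated <- numeric(length(smallVector))
for (i in seq_along(smallVector)) preallocated[i] <- smallVector[i] + 1

# 3. Apply-style; vapply() additionally checks each result is one number
applied <- vapply(smallVector, function(x) x + 1, numeric(1))

# 4. Vectorised arithmetic
vectorised <- smallVector + 1

# All four should produce exactly the same numeric vector
stopifnot(
    identical(grown, preallocated),
    identical(preallocated, applied),
    identical(applied, vectorised)
)
```

Since the output is the same either way, the choice between them comes down
to speed and readability.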

cheers,
Paul

On 08/17/2011 11:17 PM, Alex Ruiz Euler wrote: