
avoiding too many loops - reshaping data

Here is a summary of the methods suggested, each timed over 1000 iterations on the same data frame. tapply is the fastest!

library(reshape)
system.time(for(i in 1:1000) cast(melt(mydf, measure.vars = "value"),
                                  city ~ brand, fun.aggregate = sum))
   user  system elapsed
  18.40    0.00   18.44

library(reshape2)
system.time(for(i in 1:1000) dcast(mydf, city ~ brand, sum))
   user  system elapsed
  12.36    0.02   12.37

system.time(for(i in 1:1000) xtabs(value ~ city + brand, mydf))
   user  system elapsed
   2.45    0.00    2.47

system.time(for(i in 1:1000) tapply(mydf$value, mydf[c('city','brand')], sum))
   user  system elapsed
   0.78    0.00    0.79
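
For anyone without the data, here is a minimal sketch of the two fastest calls on a toy data frame. It assumes mydf has the columns city, brand and value, as the calls above imply. The data below is made up for illustration and is not the original mydf. It only checks that tapply and xtabs give the same city-by-brand sums.

```r
# Hypothetical stand-in for mydf (not the original data):
# every city/brand combination appears 4 times.
mydf <- expand.grid(city = c("a", "b", "c"), brand = c("x", "y"), rep = 1:4)
mydf$value <- as.numeric(seq_len(nrow(mydf)))

# Sums of value, with city in rows and brand in columns.
by_tapply <- tapply(mydf$value, mydf[c("city", "brand")], sum)
by_xtabs  <- xtabs(value ~ city + brand, mydf)

# Compare the cell values only; the two objects differ in class and attributes.
same_sums <- isTRUE(all.equal(as.vector(by_tapply), as.vector(by_xtabs)))
print(same_sums)
```

One difference to keep in mind: for a city/brand combination with no rows, tapply returns NA while xtabs returns 0. The toy data avoids this by filling every cell.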

Dimitri
On Wed, Nov 3, 2010 at 4:32 PM, Henrique Dallazuanna <wwwhsd at gmail.com> wrote: