tabulate
Bill Venables <William.Venables@cmis.CSIRO.AU> writes:
OK Peter. This is the first one I cooked up:
...
m <- rpois(100000, 1)
tabulate(m)
[1] 36891 18399 6064 1519 309 50 4 1
table(m)
m
    0     1     2     3     4     5     6     7     8
36763 36891 18399  6064  1519   309    50     4     1
system.time(tabulate(m))
[1] 0.11 0.00 0.00 0.00 0.00
system.time(table(m))
[1] 2.90 0.16 4.00 0.00 0.00
version
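A side note on the quoted output: tabulate(m) starts at 36891, the count of ones, because tabulate() only counts positive integers and silently drops the 36763 zeros that table(m) reports. A minimal sketch of how the two can be reconciled, assuming a smaller sample and an arbitrary seed (neither is from the original post):

```r
set.seed(1)                       # arbitrary seed, for reproducibility only
m <- rpois(1000, 1)               # smaller sample than the one quoted above

# tabulate() counts the integers 1, 2, ..., max(m); zeros fall outside that range.
positive_only <- tabulate(m)

# Shifting by one moves the value 0 into bin 1, so nothing is dropped.
shifted <- tabulate(m + 1L)
names(shifted) <- 0:max(m)

# Reference counts from table(), with every value 0..max(m) kept as a level
# so that empty bins line up with the tabulate() result.
full <- table(factor(m, levels = 0:max(m)))

print(shifted)
print(full)
```

The shifted counts and the table() counts should agree bin for bin; only the unshifted call loses the zeros.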
OK first, notice that I get:
system.time(table(m))
[1] 3.38 0.00 3.38 0.00 0.00
system.time(f<-factor(m))
[1] 2.12 0.00 2.12 0.00 0.00
system.time(table(f))
[1] 1.19 0.00 1.20 0.00 0.00
so most of the time really goes into factor(). If one is careful about the innards of table(), one can shave the time for that to
system.time(tab2(f))
[1] 0.66 0.01 0.67 0.00 0.00
Rather interestingly, the non-constant-time part of table() would seem equivalent to
system.time(as.integer(0)+as.integer(1)*(as.integer(f)-as.integer(1)))
[1] 0.25 0.00 0.25 0.00 0.00
system.time(as.integer(0)+as.integer(1)*(as.integer(f)-as.integer(1)))
[1] 0.07 0.00 0.07 0.00 0.00
Notice the huge difference between the two executions, which indicates that the number of garbage collections involved probably plays a major role. On the whole it doesn't really seem worth optimizing this very heavily, but if you have any obvious improvements for factor()...
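
To make the split between factor() and the counting step concrete, here is a rough sketch. It is not the tab2() mentioned above (its definition is not shown in this message); it only spells out the index arithmetic from the expression above and hands the result to tabulate(). The seed and the number of repetitions are arbitrary assumptions.

```r
set.seed(1)                        # arbitrary seed, for reproducibility only
m <- rpois(100000, 1)

# Step 1: the expensive part, turning the raw values into a factor.
t_factor <- system.time(f <- factor(m))

# Step 2: the zero-based bin index, written as in the expression above
# (offset 0, stride 1, integer codes shifted down by one).
bin <- 0L + 1L * (as.integer(f) - 1L)

# Step 3: count the bins; tabulate() wants one-based indices.
counts <- tabulate(bin + 1L, nbins = nlevels(f))
names(counts) <- levels(f)

# Time the same call several times; the spread between runs hints at how
# much garbage collection contributes to any single measurement.
elapsed <- replicate(5, system.time(factor(m))[["elapsed"]])

print(t_factor)
print(counts)
print(range(elapsed))
```

With a single factor the offset and stride are trivial; they presumably only matter once several factors are cross-classified, where each factor contributes its own stride to the bin index.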
--
Peter Dalgaard
Dept. of Biostatistics, University of Copenhagen
Blegdamsvej 3, 2200 Cph. N, Denmark
Ph: (+45) 35327918   FAX: (+45) 35327907
(p.dalgaard@biostat.ku.dk)