colSums in C
David Brahm <brahm at alum.mit.edu> writes:
I asked how to write speedy C code to implement colSums(). My original version on a 400x40000 matrix took 5.72s. Peter Dalgaard <p.dalgaard at biostat.ku.dk> suggested some more efficient coding, which sped my example up to 3.90s. Douglas Bates <bates at stat.wisc.edu> suggested using .Call() instead of .C(), and I was amazed to see the time go down to 0.69s! Doug had actually posted his code (a package called "MatUtils") to R-help on July 19, 2001.

I've taken Doug's code, added names to the result, and included an na.rm flag. Unfortunately, my na.rm option makes it really slow again (12.15s)! That's no faster than pre-processing the matrix with "m[is.na(m)] <- 0". Can anyone help me understand why the ISNA conditional is taking so much time? The C code is below. Thanks!
if (narm) for (j = 0; j < p; j++) {
for (sum = 0., i = 0; i < n; i++) if (!ISNA(mm[i])) sum += mm[i];
ISNA maps to the *function* R_IsNA and function calls are expensive. Also, you are probably breaking some pipelining with the extra conditional. Just for testing, what happens if you use isnan() instead? We could potentially set things up so that the compiler gets a chance to inline R_IsNA and friends, so I wonder how much we might gain.
--
Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen
Blegdamsvej 3, 2200 Cph. N, Denmark
(p.dalgaard at biostat.ku.dk)  Ph: (+45) 35327918  FAX: (+45) 35327907
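For anyone who wants to try the isnan() experiment suggested in the reply, here is a minimal standalone sketch. It is not the code from the original post: the function name col_sums and its argument list are made up for illustration, and it deliberately uses plain C arrays rather than R's SEXP interface so that it compiles without the R headers. It assumes column-major storage, as R uses. Note one behavioral difference: isnan() is true for every NaN, so this version skips ordinary NaN values as well as R's NA (which is itself a NaN with a particular payload), whereas ISNA tests for NA only.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical helper, not part of R or MatUtils.
 * Column sums of an n-by-p column-major matrix m, written to out[0..p-1].
 * If narm is nonzero, NaN entries are skipped using the C99 isnan() macro,
 * which the compiler can usually expand inline instead of calling a function. */
void col_sums(const double *m, size_t n, size_t p, int narm, double *out)
{
    assert(m != NULL && out != NULL);
    for (size_t j = 0; j < p; j++) {
        const double *col = m + j * n;   /* start of column j */
        double sum = 0.0;
        if (narm) {
            /* The narm test is hoisted out of the inner loop, so the only
             * per-element branch is the isnan() check. */
            for (size_t i = 0; i < n; i++)
                if (!isnan(col[i])) sum += col[i];
        } else {
            for (size_t i = 0; i < n; i++)
                sum += col[i];
        }
        out[j] = sum;
    }
}
```

Timing this against a variant that calls an out-of-line function for the NA test on the same 400x40000 matrix should give a rough idea of how much of the slowdown comes from the function call and how much from the extra branch.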