Computing sums of the columns of an array
On 8/5/2005 12:43 PM, Uwe Ligges wrote:
Duncan Murdoch wrote:
On 8/5/2005 12:16 PM, Martin C. Martin wrote:
Hi,

I have a 5x731 array A, and I want to compute the sums of the columns. Currently I do:

    apply(A, 2, sum)

But it turns out, this is slow: 70% of my CPU time is spent here, even though there are many complicated steps in my computation. Is there a faster way?
You'd probably do better with matrix multiplication:

    rep(1, nrow(A)) %*% A
No, better use colSums(), which has been optimized for this purpose:

    A <- matrix(seq(1, 10000000), ncol=10000)
    system.time(colSums(A))               # ~ 0.1 sec.
    system.time(rep(1, nrow(A)) %*% A)    # ~ 0.5 sec.
I didn't claim my solution was the best, only better. :-)

One point of interest: I think your example exaggerates the difference by using a matrix of integers. On my machine I get a ratio something like yours with the same example

    > A <- matrix(seq(1, 10000000), ncol=10000)
    > system.time(colSums(A))
    [1] 0.08 0.00 0.08 NA NA
    > system.time(rep(1, nrow(A)) %*% A)
    [1] 0.25 0.01 0.23 NA NA

but if I make A floating point, there's much less difference:

    > A <- matrix(as.numeric(seq(1, 10000000)), ncol=10000)
    > system.time(colSums(A))
    [1] 0.09 0.00 0.09 NA NA
    > system.time(rep(1, nrow(A)) %*% A)
    [1] 0.11 0.00 0.12 NA NA

Still, colSums is the winner in both cases.

Duncan Murdoch
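
For anyone wanting to check that the three approaches discussed in this thread are interchangeable before swapping one for another, here is a minimal sketch. It assumes a random 5x731 numeric matrix as a stand-in for the original poster's A (the real data was never posted), and the names s_apply, s_matmul and s_col are made up for illustration.

```r
# Hypothetical stand-in for the poster's 5 x 731 array A (assumption: numeric data)
set.seed(1)
A <- matrix(runif(5 * 731), nrow = 5)

# 1. The original approach: apply sum() over columns
s_apply <- apply(A, 2, sum)

# 2. Matrix multiplication by a row of ones; %*% returns a 1 x ncol(A) matrix,
#    so drop it to a plain vector for comparison
s_matmul <- as.vector(rep(1, nrow(A)) %*% A)

# 3. The dedicated function
s_col <- colSums(A)

# All three should agree up to floating-point tolerance
stopifnot(isTRUE(all.equal(s_apply, s_matmul)),
          isTRUE(all.equal(s_apply, s_col)))
```

Note that the matrix-multiplication version gives back a one-row matrix rather than a vector, which may matter if the result is passed on to code that checks dimensions. Timings will depend on the machine and R version, so it is worth rerunning system.time() on your own data rather than relying on the figures quoted above.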