Speeding up sum and prod
On Aug 23, 2010, at 1:19 PM, Radford Neal wrote:
Looking for more ways to speed up R, I've found that large improvements are possible in the speed of "sum" and "prod" for long real vectors.
The results are likely very compiler- and architecture specific. On my machine [x86_64, OS X 10.6] your version is actually slower for narm (the more you optimize the more assumptions you are making which may turn out to be false): baseline:
a <- seq(0,1,length=1000)
system.time({for (i in 1:1000000) b <- sum(a)})
user system elapsed 2.412 0.018 2.431
system.time({for (i in 1:1000000) b <- sum(a,na.rm=TRUE)})
user system elapsed 2.510 0.001 2.509 RN version:
a <- seq(0,1,length=1000)
system.time({for (i in 1:1000000) b <- sum(a)})
user system elapsed 1.691 0.004 1.694
system.time({for (i in 1:1000000) b <- sum(a,na.rm=TRUE)})
user system elapsed 3.527 0.001 3.528 If you simply take out the narm check and don't mess with updated you get to a more reasonable:
a <- seq(0,1,length=1000)
system.time({for (i in 1:1000000) b <- sum(a)})
user system elapsed 1.688 0.003 1.691
system.time({for (i in 1:1000000) b <- sum(a,na.rm=TRUE)})
user system elapsed 2.522 0.000 2.522 But just to bring things into perspective -- simply changing the compiler [here just for that one file with still the same optimization level] will give you:
a <- seq(0,1,length=1000)
system.time({for (i in 1:1000000) b <- sum(a)})
user system elapsed 5.098 0.003 5.102
system.time({for (i in 1:1000000) b <- sum(a,na.rm=TRUE)})
user system elapsed 5.213 0.000 5.214 ... and using the original (unmodified) R code with a more optimized flags will give you
a <- seq(0,1,length=1000)
system.time({for (i in 1:1000000) b <- sum(a)})
user system elapsed 1.835 0.003 1.838
system.time({for (i in 1:1000000) b <- sum(a,na.rm=TRUE)})
user system elapsed 2.473 0.001 2.474 ... and the "optimal" version above (3rd) with the same optimization settings:
a <- seq(0,1,length=1000)
system.time({for (i in 1:1000000) b <- sum(a)})
user system elapsed 1.670 0.003 1.673
system.time({for (i in 1:1000000) b <- sum(a,na.rm=TRUE)})
user system elapsed 5.555 0.001 5.556 ... so you just can't win ... Cheers, Simon
Here is a little test with R version 2.11.1 on an Intel Linux system
a <- seq(0,1,length=1000)
system.time({for (i in 1:1000000) b <- sum(a)})
user system elapsed 4.800 0.010 4.817
system.time({for (i in 1:1000000) b <- sum(a,na.rm=TRUE)})
user system elapsed 8.240 0.030 8.269 and here is the same with "sum" and "prod" modified as described below:
a <- seq(0,1,length=1000)
system.time({for (i in 1:1000000) b <- sum(a)})
user system elapsed 1.81 0.00 1.81
system.time({for (i in 1:1000000) b <- sum(a,na.rm=TRUE)})
user system elapsed
7.250 0.010 7.259
That's an improvement by a factor of 2.65 for real vectors of length
1000 with na.rm=FALSE (the default), and an improvement of 12% when
na.rm=TRUE. Of course, the improvement is smaller for very short
vectors.
The biggest reason for the improvement is that the current code (in
2.11.1 and in the development release of 2010-08-19) makes a costly
call of ISNAN even when the option is na.rm=FALSE. The inner loop
can also be sped up a bit in other respects.
Here is the old procedure, in src/main/summary.c:
static Rboolean rsum(double *x, int n, double *value, Rboolean narm)
{
LDOUBLE s = 0.0;
int i;
Rboolean updated = FALSE;
for (i = 0; i < n; i++) {
if (!ISNAN(x[i]) || !narm) {
if(!updated) updated = TRUE;
s += x[i];
}
}
*value = s;
return(updated);
}
and here is my modified version:
static Rboolean rsum(double *x, int n, double *value, Rboolean narm)
{
LDOUBLE s = 0.0;
int i;
Rboolean updated = FALSE;
if (narm) {
for (i = 0; i < n; i++) {
if (!ISNAN(x[i])) {
s += x[i];
updated = TRUE;
break;
}
}
for (i = i+1; i < n; i++) {
if (!ISNAN(x[i]))
s += x[i];
}
} else {
for (i = 0; i < n; i++)
s += x[i];
if (n>0) updated = TRUE;
}
*value = s;
return(updated);
}
An entirely analogous improvement can be made to the "prod" function.
Radford Neal
______________________________________________ R-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel