Speeding up sum and prod

On Aug 23, 2010, at 1:19 PM, Radford Neal wrote:

            
The results are likely very compiler- and architecture-specific. On my machine [x86_64, OS X 10.6] your version is actually slower for narm (the more you optimize, the more assumptions you make that may turn out to be false):

baseline:
   user  system elapsed
  2.412   0.018   2.431
   user  system elapsed
  2.510   0.001   2.509

RN version:
   user  system elapsed
  1.691   0.004   1.694
   user  system elapsed
  3.527   0.001   3.528

If you simply take out the narm check and don't mess with updated, you get a more reasonable result:
   user  system elapsed
  1.688   0.003   1.691
   user  system elapsed
  2.522   0.000   2.522
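
Neither version of the code is shown in this message, so the following is only a rough sketch of the kind of change under discussion, not the code from the R sources or from the proposed patch. It assumes a plain double vector and a flag named narm; the function names sum_check_inside and sum_check_hoisted are made up for illustration. The first function tests the flag on every iteration, the second moves the test out of the loop. Which one is faster depends on what the compiler does with each loop, which is the point of the timings here.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical sketch (not R's summary.c): the narm flag is
   tested on every iteration, together with the NaN test. */
double sum_check_inside(const double *x, size_t n, int narm)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (!narm || !isnan(x[i]))
            s += x[i];
    }
    return s;
}

/* Hypothetical sketch: the narm test is done once, outside the
   loop, so the narm == 0 case is a bare accumulation loop. */
double sum_check_hoisted(const double *x, size_t n, int narm)
{
    double s = 0.0;
    if (narm) {
        for (size_t i = 0; i < n; i++)
            if (!isnan(x[i]))
                s += x[i];
    } else {
        for (size_t i = 0; i < n; i++)
            s += x[i];
    }
    return s;
}
```

Both functions return the same value for the same input; only the shape of the loop differs, so any timing difference between them comes from the compiler and the hardware rather than from the arithmetic.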


But just to put things into perspective -- simply changing the compiler [here just for that one file, still at the same optimization level] will give you:
   user  system elapsed
  5.098   0.003   5.102
   user  system elapsed
  5.213   0.000   5.214

... and using the original (unmodified) R code with more aggressive optimization flags will give you:
   user  system elapsed
  1.835   0.003   1.838
   user  system elapsed
  2.473   0.001   2.474

... and the "optimal" version above (3rd) with the same optimization settings:
   user  system elapsed
  1.670   0.003   1.673
   user  system elapsed
  5.555   0.001   5.556

... so you just can't win ... 

Cheers,
Simon