
[R-pkg-devel] Cannot create C code with acceptable performance with respect to internal R command.

Luc,

There can be many reasons for a difference in the performance of 
compiled code. Tuning such code to reach peak performance is 
generally a fine art.
Optimization techniques include, but are not limited to:
 - SIMD instructions (and memory alignment for their optimal use);
 - instruction-level parallelism;
 - loop unrolling;
 - cache hits and misses;
 - multi-thread parallelism;
 - ...
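
To make the cache item concrete, here is a minimal sketch (not the code 
from your package, and not what R does internally) of the same matrix 
product written with two loop orders. It assumes square n x n matrices 
stored column-major, as R stores them; the function names are made up 
for the illustration.

```c
#include <stddef.h>

/* C = A * B, all n x n, column-major: element (i, j) is at x[i + j * n].
 * Loop order i, j, k: the inner loop reads A with stride n, so for
 * large n almost every access to A touches a different cache line. */
void matmul_ijk(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            double s = 0.0;
            for (size_t k = 0; k < n; k++)
                s += a[i + k * n] * b[k + j * n];
            c[i + j * n] = s;
        }
    }
}

/* Same arithmetic, loop order j, k, i: the inner loop runs down one
 * column of A and one column of C with unit stride, which is friendlier
 * to the cache and easier for the compiler to vectorize. */
void matmul_jki(size_t n, const double *a, const double *b, double *c)
{
    for (size_t j = 0; j < n; j++) {
        for (size_t i = 0; i < n; i++)
            c[i + j * n] = 0.0;
        for (size_t k = 0; k < n; k++) {
            const double bkj = b[k + j * n];
            for (size_t i = 0; i < n; i++)
                c[i + j * n] += a[i + k * n] * bkj;
        }
    }
}
```

Both functions compute the same product; how much faster the second one 
is depends on the matrix size, the CPU and the compiler flags, so it is 
worth timing both on your own machine.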
The approach to optimization differs depending on the kind of 
application: CPU-bound, memory-bound or IO-bound.
Many of these techniques may or may not be applied by the compiler 
itself, depending on the chosen options. Are you sure you are using the 
same compiler and options that were used when R was compiled?
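
One rough way to check this from inside the compiled code is sketched 
below. It relies on the predefined macros __OPTIMIZE__ and __VERSION__, 
which GCC and Clang provide; other compilers may not define them, and 
the function names are made up for the illustration. The flags R uses 
for package code can also be displayed with `R CMD config CFLAGS`.

```c
#include <stdio.h>

/* Returns 1 if this file was compiled with optimization enabled.
 * GCC and Clang define __OPTIMIZE__ at -O1 and above; with a compiler
 * that does not define it, this always reports 0. */
int built_with_optimization(void)
{
#ifdef __OPTIMIZE__
    return 1;
#else
    return 0;
#endif
}

/* Returns the compiler's version string, or "unknown" when the
 * __VERSION__ macro is not available. */
const char *compiler_version(void)
{
#ifdef __VERSION__
    return __VERSION__;
#else
    return "unknown";
#endif
}

/* Prints both pieces of information, e.g. once at start-up of a
 * stand-alone benchmark program. */
void print_build_info(void)
{
    printf("compiler:  %s\n", compiler_version());
    printf("optimized: %s\n", built_with_optimization() ? "yes" : "no");
}
```

This only tells you whether some optimization level was active, not 
which individual flags were used, so it complements rather than 
replaces a look at the actual compile command.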
And finally, the code being compared may simply not be the same. R can 
call BLAS, e.g. OpenBLAS, to multiply two matrices. The latter is 
heavily optimized for such operations and can achieve a 10x speed-up 
compared to a plain "naive" BLAS implementation.
The R code you cite may be just the fallback used when no BLAS was 
found during R compilation.
Look at what your sessionInfo() reports about the BLAS in use.

Best,
Serguei.

On 05/12/2024 at 14:21, Luc De Wilde wrote: