compiler flags for performance

5 messages · lejeczek, Arnaud Gaboury, Dirk Eddelbuettel

#
hi guys,

I'd like to ask, and I believe this is the best place since who would
know better, whether building R with different compilers and optimization
flags is worth investing time in.

Or maybe somebody has already investigated this. If so, what were the
conclusions?

The reason I ask is that on CentOS 7.6, different compilers, both from
the stock repo and from the so-called Software Collections, do not
produce (with performance flags) R binaries that perform any better,
according to R-benchmark-25 at least, than the "vanilla" packages
shipped by the distro.

That makes me curious: is R simply a case that does not respond to
compiler performance optimizations?

Is there perhaps a more structured and organized way to run such
compiler/optimization benchmarks and tests?

What can the developers say and advise on the compile-for-performance
subject?

many thanks, L.
#
On 13 June 2019 at 16:05, lejeczek via R-devel wrote:
| I'd like to ask, and I believe this place here should be best as who can
| know better, if building R with different compilers and opt flags is
| something worth investing time into?
| 
| Or maybe this a subject that somebody has already investigated. If yes
| what then are the conclusion?
| 
| Reason I ask is such that, on Centos 7.6 with different compilers from
| stock repo but also from so called software collections, do not
| render(with flags for performance) an R binaries which would perform any
| better, according to R-benchmark-25 at least, then "vanilla" packages
| shipped from distro.
| 
| And that makes me curious - is it because R is such a case which is
| prone to any compiler performance optimizations?
| 
| Maybe there is more structured and organized way to conduct such
| different-compilers-optimizations benchmarks/test?
| 
| What do devel can say and advise with regards to compile-for-performance
| subject?

Of course you can do that, and add those switches to ~/.R/Makevars.  The
resulting binaries may become non-portable.

E.g. "at work" we use -march=native quite a bit, but it means we can't share
libraries from a beefier dev box with skinnier deployment boxen as they don't
have the same chipset even though they are both x86_64 and use the same Linux
distro.
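To make the portability point concrete, here is a minimal Python sketch that compares the CPU feature flags of a build machine and a deployment machine, given the text of each machine's /proc/cpuinfo (Linux only). The function names and the sample data are hypothetical and not from this thread. The result is only a rough hint: -march=native lets the compiler use instructions the build CPU supports, so any flag missing on the target is a potential illegal-instruction crash, not a guaranteed one.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # The first "flags" line is enough; all cores normally report the same set.
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_on_target(build_cpuinfo, target_cpuinfo):
    """Sorted list of flags the build machine has but the target machine lacks."""
    return sorted(cpu_flags(build_cpuinfo) - cpu_flags(target_cpuinfo))


# Hypothetical sample data standing in for two real /proc/cpuinfo files.
dev_box = "processor\t: 0\nflags\t\t: fpu sse2 avx avx2 avx512f\n"
deploy_box = "processor\t: 0\nflags\t\t: fpu sse2 avx\n"
print(missing_on_target(dev_box, deploy_box))
```

In practice one would read the real file on each machine, e.g. open("/proc/cpuinfo").read(), and compare the two outputs before shipping natively tuned libraries.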

As for which switches help in which way on different compilers: that is
probably best seen as a black box.  Time and profile locally; I no longer try
to generalize.  The newer link-time optimizations can help too, though they
certainly make builds longer ...
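One way to "time locally" in a repeatable manner is sketched below in Python, using only the standard library. It runs a command several times and compares medians between a baseline build and a candidate build. The function names are hypothetical, and the Rscript paths in the usage note are placeholders rather than anything from this thread.

```python
import statistics
import subprocess
import time


def time_command(cmd, runs=5):
    """Run cmd (a list of arguments) `runs` times; return wall-clock seconds per run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the command fails, so broken runs are not timed silently.
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Median plus range; the median is less sensitive to one-off noise than the mean."""
    return {"median": statistics.median(timings),
            "min": min(timings),
            "max": max(timings)}


def speedup(baseline, candidate):
    """Ratio of median times; values above 1.0 mean the candidate build is faster."""
    return statistics.median(baseline) / statistics.median(candidate)
```

Usage might look like speedup(time_command(["/usr/bin/Rscript", "R-benchmark-25.R"]), time_command(["/opt/R-custom/bin/Rscript", "R-benchmark-25.R"])), where both paths are assumed locations of the distro build and the custom build. If the ranges of the two runs overlap heavily, the difference is probably within noise.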

Dirk
#
On 13/06/2019 16:14, Dirk Eddelbuettel wrote:
I've tried the "usual" tweaks, and what puzzles me is that -march=native
and LTO (-flto) plus -Os/-O3 do not help much: with gcc >= 7 they make
almost invisible improvements (again, judging by results from
R-benchmark-25) compared to the distro's package, which is built with
-O2 -mtune=generic and no LTO.

Would there be another (better) way to test core R?

What kind of R performance increases do you guys see with compiler
optimization flags, if any?

regards, L.
#
On Fri, Jun 14, 2019 at 1:44 PM lejeczek via R-devel <r-devel at r-project.org>
wrote:
It is worth spending some time on, but all in all you may end up disappointed.
There are other things you may try: the new Intel Linux distro (optimized for
Intel processors); building with the Clang compiler instead of GCC; using an
optimized BLAS (that is indeed a very good idea, look for OpenBLAS).

I have built R with Intel MKL. The libraries are free for one year (maybe
that has changed). The build is far from trivial. Please find some details
on my GitHub [0].

  
  
#
On 14 June 2019 at 15:22, arnaud gaboury wrote:
| I have build R with Intel MKL.The libraries are free for one year (maybe
| did it change). The build is far from being trivial. Please find on my
| github[0] some details

Gee, when oh when does this "meme" of "I built R with MKL" die?

BLAS/LAPACK are _an interface_ and once you tell R to configure with BLAS as
a shared library, _all_ matching BLAS/LAPACK libraries become _pluggable_. My
gcbd package and vignette demonstrated that a decade+ ago (then using Goto).
It also holds for MKL, and these days Intel tries harder with a) friendlier
licenses and b) better packaging -- they even give .deb (and I believe .rpm).

So now you just drop MKL in/out with a single script which you can find here
https://github.com/eddelbuettel/mkl4deb with supporting blog posts at
http://dirk.eddelbuettel.com/blog/2018/04/15#018_mkl_for_debian_ubuntu

So please, let's not repeat this 'you have to use Revolution / Microsoft /
$whatever R to get MKL' or 'you have to recompile R for MKL'.
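The "BLAS is an interface" point can be illustrated outside R with a small Python sketch using ctypes. The caller below only knows the standard Fortran BLAS routine ddot; whichever shared library the system resolves for "blas" (reference BLAS, OpenBLAS, MKL, ...) supplies the implementation. This assumes an LP64 BLAS that exports the symbol as ddot_ (trailing underscore), which is common but not guaranteed; the helper names are hypothetical, and the function returns None when no suitable library is found.

```python
import ctypes
import ctypes.util


def load_blas():
    """Return a handle to whichever BLAS shared library the system resolves, or None."""
    name = ctypes.util.find_library("blas")
    if name is None:
        return None
    try:
        return ctypes.CDLL(name)
    except OSError:
        return None


def blas_ddot(x, y):
    """Dot product via the BLAS routine ddot; None if no usable BLAS is found."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    lib = load_blas()
    if lib is None:
        return None
    try:
        ddot = lib.ddot_  # assumption: Fortran-style symbol name
    except AttributeError:
        return None
    int_p = ctypes.POINTER(ctypes.c_int)
    dbl_p = ctypes.POINTER(ctypes.c_double)
    ddot.restype = ctypes.c_double
    ddot.argtypes = [int_p, dbl_p, int_p, dbl_p, int_p]
    # Fortran passes every argument by reference, including scalars.
    n = ctypes.c_int(len(x))
    inc = ctypes.c_int(1)
    xs = (ctypes.c_double * len(x))(*x)
    ys = (ctypes.c_double * len(y))(*y)
    return ddot(ctypes.byref(n), xs, ctypes.byref(inc), ys, ctypes.byref(inc))
```

The calling code does not change when the underlying library is swapped; only the speed does, which is the same property that lets R pick up a different BLAS without being rebuilt.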

Lastly, whether it matters is in the eye of the beholder. Because the
optimization in the MKL appears to come from _many_ explicit code paths for
many Intel CPU (micro-)architectures, the installed footprint is sizeable --
IIRC it was 2 GB when I wrote the blog post above.  My linear algebra use (at
home) is light so I just kept OpenBLAS, which is almost as fast and is proper
free software with a smaller installation footprint. Your mileage, as they
say, may vary.

Dirk