[Bioc-devel] C++ code performance issues
Hi Martin,

thanks for the tips. I did a bit more investigation, and it turned out that the development version of R is not compiling with optimization flags when installing packages. I am not sure whether this was also the case initially, but I know for sure that it was using -O3 when running CMD check; maybe I just got confused and never noticed that it is not used during installation.

Is it safe to assume that optimization flags will be used in the stable release version, or is it better to specify them in the package's Makevars?

Peter.
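One way to confirm from inside the compiled code whether optimization was actually in effect is to test the __OPTIMIZE__ macro, which GCC and Clang predefine for any -O level above -O0. The following is only a minimal sketch: the function names optimization_state and report_optimization_state are hypothetical and not part of BitSeq, and other compilers may not define the macro at all.

```cpp
#include <iostream>
#include <string>

// Describes the optimization state this translation unit was compiled with.
// __OPTIMIZE__ is predefined by GCC and Clang when optimizing; on other
// compilers the macro may be absent even for optimized builds (assumption
// stated in the returned text).
std::string optimization_state() {
#if defined(__OPTIMIZE__)
    return "optimized";
#else
    return "not optimized (or compiler does not define __OPTIMIZE__)";
#endif
}

// Hypothetical helper: print the state once, e.g. at the start of a run.
// Inside an R package one would print through R's own facilities
// (such as Rprintf) rather than std::cerr.
void report_optimization_state() {
    std::cerr << "build: " << optimization_state() << std::endl;
}
```

Calling report_optimization_state() once from both the command-line program and the code loaded by R would show whether the two builds really differ in this respect.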
On 21/03/13 19:36, Martin Morgan wrote:
On 03/21/2013 11:30 AM, Peter Glaus wrote:
Hi,

I am working on the BitSeq package, which has both a command-line C++ version and a Bioconductor version in which R calls the same C++ code via the .C function. While testing the development version of the package on R 3.0.0, I noticed that the "R version" runs much slower: 2-3 TIMES slower than the pure C++ implementation. Interestingly, the stable release of the "R version" seems to be as fast as the C++ version. (The underlying code has changed slightly, but there shouldn't be much difference.)

Is there any reason for such behavior? Has anyone encountered a similar issue? Is there a way to make the C++ code called from R faster?

More details: I compiled the C++ code with the same g++ flags (... -O3 -pipe -fpic -g... ) and removed OpenMP support from both. The functions take exactly the same input (input is read from a file) and produce exactly the same output (using the same seed). A specific computation that took the C++ version 12 minutes took the R(C++) version 47 minutes. There is no IO during that part of the code, and there was just one R_CheckUserInterrupt() call during this time (I changed the code so that there would not be many of these calls).

There are just a few differences in the last stable release, and that seems to run even faster than the current C++ (10 minutes). (The stable release uses -O2 while compiling the C++ code.)
Can you narrow this down to something more reproducible, e.g., a particular call that causes problems, including the platform(s) on which you are seeing issues? Maybe you're running out of memory (because R is holding memory that the command line does not access)? Probably you spend most of your time 'in C' or 'in R', rather than moving between them?

You could try, on Linux / Mac, a cheap C-level guesstimate of where time is spent by running under gdb

  R -d gdb
  (gdb) run

and then periodically breaking with Ctrl-C and looking where you are

  (gdb) backtrace  ## stack trace
  (gdb) continue

and comparing the same under the command line

  gdb ./bitseq

or doing some more serious profiling as outlined in section 3.4 of 'Writing R Extensions'; probably you would start by getting a short reproducible example.

Martin
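As a complement to sampling with gdb, the suspect section can also be timed directly in both builds. This is a minimal sketch using std::chrono (C++11); busy_work is a hypothetical stand-in for the expensive computation and time_busy_work is an illustrative name, neither taken from BitSeq.

```cpp
#include <chrono>

// Hypothetical stand-in for the expensive section being compared:
// sums the first n terms of the harmonic series.
double busy_work(long n) {
    double acc = 0.0;
    for (long i = 1; i <= n; ++i) {
        acc += 1.0 / static_cast<double>(i);
    }
    return acc;
}

// Runs busy_work(n), stores its value in *result, and returns the elapsed
// wall-clock time in seconds. steady_clock is used because it is monotonic
// and unaffected by system clock adjustments.
double time_busy_work(long n, double *result) {
    auto start = std::chrono::steady_clock::now();
    *result = busy_work(n);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Running the same timed section once from the command-line program and once through R gives a direct comparison on identical input, without the overhead of a debugger.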
Thanks, Peter.
_______________________________________________ Bioc-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel