Skip to content
Prev 5051 / 10988 Next

[Rcpp-devel] performance profile from Rcpp or RcppArmadillo code? Matrix subviews, etc

Hi Mario:
On Sat, Jan 5, 2013 at 9:09 PM, Mario Bourgoin <m.bourgoin at gmail.com> wrote:
I agree, that writeup is a nice survey.

I'm stuck at a more fine-grained situation, where I really want a
profile of a large function using Armadillo and am as frustrated as
hell trying to get one.

Following R Writing Extensions, I am trying sprof and oprofile.

sprof seems to run, but never generates the profile output. I follow
this approach (http://greg-n-blog.blogspot.com/2010/01/profiling-shared-library-on-linux-using.html)
for sprof.  It makes the R test job run slower, I just can't find the
output file. There are no error messages.

$ export LD_PROFILE=/home/pauljohn/.../Amelia.so
$ export LD_PROFILE_OUTPUT=`pwd`

$ R -f testCode.R

But, for whatever reason, no profile file appears.  I'm still checking
into it. I've tried the advice in Writing R Extensions about writing
the profile into /var/tmp/... after creating a directory as root,
still same outcome. I feel sure I'm missing something very obvious.

oprofile looks exciting, if you have the newer edition (0.9.8).

In Debian, we don't have a pre-built package for oprofile-0.9.8, so
I've compiled that and it is interesting, but very complicated. It
gives a nice general overview that says the program is spending most
of its time in BLAS and LAPACK, but it doesn't help me very much. I
need to know which parts of the CPP code are causing all that access
to BLAS or LAPACK.

In case you have not seen oprofile...

oprofile-0.9.8 has a new function "operf" that can run WITHOUT ROOT
ACCESS and it generates a large-ish folder of profiler output:

$ operf --callgraph R -f profile-AmeliaSpeedup-4.R

After that, however, is where I'm frustrated because I can't manage
the output. In case  you've never seen this output, here's a quick
paste. Perhaps other list readers will have pointers for me on how to
more wisely use opreport to trim down the time spent in the functions
inside the package shared library that I'm studying right now, which
is Amelia.so.

$ opreport --callgraph  | more
Using /home/pauljohn/sync/work/R-Amelia-Armadillo/oprofile_data/samples/
for samples directory.
warning: /no-vmlinux could not be found.
CPU: Intel Sandy Bridge microarchitecture, speed 2.701e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a
unit mask of 0x00 (No unit mask) count 90000
warning: dropping hyperspace sample at offset 21e4c8 >= 221d8 for
binary /lib/x86_64-linux-gnu/ld-2.13.so
warning: dropping hyperspace sample at offset 21e4c8 >= 221d8 for
binary /lib/x86_64-linux-gnu/ld-2.13.so
warning: dropping hyperspace sample at offset 10fd0 >= fb7e for binary
/home/pauljohn/R/x86_64-pc-linux-gnu-library/2.15/Amelia/libs/Amelia.so
warning: dropping hyperspace sample at offset 1100 >= 1048 for binary
/lib/x86_64-linux-gnu/libnss_compat-2.13.so
warning: dropping hyperspace sample at offset 11a0 >= 1048 for binary
/lib/x86_64-linux-gnu/libnss_compat-2.13.so
samples  %        image name               app name                 symbol name
-------------------------------------------------------------------------------
  11       100.000  libc-2.13.so             R                        malloc
31109    31.5168  libblas.so.3.0           R
/usr/lib/atlas-base/atlas/libblas.so.3.0
  31109    100.000  libblas.so.3.0           R
/usr/lib/atlas-base/atlas/libblas.so.3.0 [self]
-------------------------------------------------------------------------------
  7        100.000  libRcpp.so               R
virtual thunk to Rcpp::Rostream<true>::~Rostream()
18613    18.8570  libR.so                  R
/usr/lib/R/lib/libR.so
  18613    100.000  libR.so                  R
/usr/lib/R/lib/libR.so [self]
-------------------------------------------------------------------------------
  37       100.000  libc-2.13.so             R                        malloc
17443    17.6717  liblapack.so.3.0         R
/usr/lib/lapack/liblapack.so.3.0
  17443    100.000  liblapack.so.3.0         R
/usr/lib/lapack/liblapack.so.3.0 [self]
-------------------------------------------------------------------------------
  4521     100.000  libc-2.13.so             R
_int_malloc
4521      4.5803  libc-2.13.so             R                        _int_malloc
  4521     48.4306  libc-2.13.so             R
_int_malloc
  4521     48.4306  libc-2.13.so             R
_int_malloc [self]
  293       3.1387  no-vmlinux               R
/no-vmlinux
-------------------------------------------------------------------------------
  2789     100.000  Amelia.so                R
arma::subview_elem2<double, arma::Mat<unsigned int>,
 arma::Mat<unsigned int> >::extract(arma::Mat<double>&,
arma::subview_elem2<double, arma::Mat<unsigned int>, arma::Mat<uns
igned int> > const&)
  2789     49.9105  Amelia.so                R
arma::subview_elem2<double, arma::Mat<unsigned int>,
 arma::Mat<unsigned int> >::extract(arma::Mat<double>&,
arma::subview_elem2<double, arma::Mat<unsigned int>, arma::Mat<uns
igned int> > const&)
  2789     49.9105  Amelia.so                R
arma::subview_elem2<double, arma::Mat<unsigned int>,
 arma::Mat<unsigned int> >::extract(arma::Mat<double>&,
arma::subview_elem2<double, arma::Mat<unsigned int>, arma::Mat<uns
igned int> > const&) [self]
  10        0.1790  no-vmlinux               R
/no-vmlinux

...

THe output has a bunch of stanzas like this, and I have trouble
understanding what the output means.  The top line says there are 621
calls to line in the shared library that uses arma::subview_elem2.
What do the lines after that mean, where it has 0.6291 on line 2 and
49.9568 (repeated on 2 lines in a row)

-------------------------------------------------------------------------------
  621      100.000  Amelia.so                R
void arma::subview_elem2<double, arma::Mat<unsigned
int>, arma::Mat<unsigned int> >::inplace_op<arma::op_subview_elem_equ,
arma::Op<arma::Mat<double>, arma::op_inv_sympd> >(a
rma::Base<double, arma::Op<arma::Mat<double>, arma::op_inv_sympd> > const&)
621       0.6291  Amelia.so                R
void arma::subview_elem2<double, arma::Mat<unsigned in
t>, arma::Mat<unsigned int> >::inplace_op<arma::op_subview_elem_equ,
arma::Op<arma::Mat<double>, arma::op_inv_sympd> >(arm
a::Base<double, arma::Op<arma::Mat<double>, arma::op_inv_sympd> > const&)
  621      49.9598  Amelia.so                R
void arma::subview_elem2<double, arma::Mat<unsigned
int>, arma::Mat<unsigned int> >::inplace_op<arma::op_subview_elem_equ,
arma::Op<arma::Mat<double>, arma::op_inv_sympd> >(a
rma::Base<double, arma::Op<arma::Mat<double>, arma::op_inv_sympd> > const&)
  621      49.9598  Amelia.so                R
void arma::subview_elem2<double, arma::Mat<unsigned
int>, arma::Mat<unsigned int> >::inplace_op<arma::op_subview_elem_equ,
arma::Op<arma::Mat<double>, arma::op_inv_sympd> >(a
rma::Base<double, arma::Op<arma::Mat<double>, arma::op_inv_sympd> >
const&) [self]
  1         0.0805  no-vmlinux               R
/no-vmlinux


Oh,well.  It is fun when little glimmers of insight flash bye :)

pj