Skip to content

[Rcpp-devel] How to increase the coding efficiency

12 messages · Christian Gunning, Honglang Wang, Tim Triche, Jr. +5 more

#
Can you post a minimal full example?
-Christian

On Tue, Dec 4, 2012 at 8:39 PM, Honglang Wang
<wanghonglang2008 at gmail.com> wrote:

  
    
#
The following is a full example although I don't know whether it's minimal
or not:

library(Rcpp)
library(RcppArmadillo)
sourceCpp("betahat_mod.cpp")

#The following is data generation.
  n=200
  m=20
  p=2
  t=runif(m*n,min=0, max=1)
  X1=rnorm(m*n,0,1)
  X1=as.matrix(1+2*exp(t)+X1)
  X2=rnorm(m*n,0,1)
  X2=as.matrix(3-4*t^2+X2)
  X=cbind(X1,X2)
  beta1=0.5*sin(t)
  beta2=beta1
  rho=0.2
  sig2=1

eps=unlist(lapply(1:n,function(x){as.matrix(arima.sim(list(ar=rho),m,sd=sqrt(sig2*(1-rho^2))))}))
  y=as.matrix(X1*beta1+X2*beta2+eps)

#A simple kernel function.
ker=function(x,h)
  {
    ans=x
    lo=(abs(x)<h)
    ans[lo]=(3/4)*(1-(x[lo]/h)^2)/h
    ans[!lo]=0
    return(ans)
  }


h=0.3

#assess the time for evaluating bethat of 100 t's:
system.time((betahatt=t(apply(as.matrix(t[1:100]),1,function(x)
betahat(ker,x,X,y,t,h,m)$betahat))))

And the .cpp file is the following:

// [[Rcpp::depends(RcppArmadillo)]]

#include <RcppArmadillo.h>

using namespace Rcpp;

// [[Rcpp::export]]
List betahat(Function ker, double t0, NumericMatrix Xr, NumericMatrix yr,
NumericVector tr, double h, int m) {
  int n = Xr.nrow(), p = Xr.ncol();
  arma::mat X(Xr.begin(), n, p, false);
  arma::mat y(yr.begin(), n, 1, false);
  arma::colvec t(tr.begin(), tr.size(), false);
  arma::mat T = X;
  T.each_col() %= (t-t0)/h;
  arma::mat D = arma::join_rows(X,T);
  arma::vec kk =as<arma::vec>(ker(tr-t0,h));
  arma::mat W = (arma::diagmat(kk))/m;
  arma::mat Inv = arma::trans(D)*W*D;
  arma::vec betahat = arma::inv(Inv)*arma::trans(D)*W*y;
  arma::colvec betahat0(betahat.begin(),betahat.size()/2,false);
  return List::create(Named("betahat") = betahat0);
}


Best wishes!

Honglang Wang

Office C402 Wells Hall
Department of Statistics and Probability
Michigan State University
1579 I Spartan Village, East Lansing, MI 48823
wangho16 at msu.edu
On Wed, Dec 5, 2012 at 4:07 AM, Christian Gunning <xian at unm.edu> wrote:

            
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/rcpp-devel/attachments/20121205/c36a832e/attachment.html>
#
this part will always make your code crawl along:

On Wed, Dec 5, 2012 at 11:09 AM, Honglang Wang
<wanghonglang2008 at gmail.com>wrote:

            
first time I wrote a GLM engine, I wrote it the way statistics books
illustrate it (i.e. actually invert things to do IWLS) and it crawled.  I
took it to a physicist friend who went through alternate stages of disgust
and laughter then told me never to invert anything I didn't have to.

You should take Doug Bates' advice, it could save you a great deal of time.
 Never bit fiddle when you can use a better algorithm.
#
I'd be very interested to see the before/after version of your code
code, Honglang. With timings. It'll make a good blog post or academic
paper (depending on just how many learnings you get from the process) :-)

Darren
#
I don't know as much about Armadillo as I do about Eigen so I will cheat
and write using RcppEigen instead of RcppArmadillo.  Page 6 of the Eigen
tutorial at http://eigen.tuxfamily.org/dox/ discusses decompositions and
solving linear systems of equations.  One of the simplest ways of solving a
weighted least squares system, if you only want the solution itself, is
with a QR decomposition of which there are 3 in Eigen.  The simplest is
HouseholderQR (named after A.S. Householder who first described an
elementary reflector).  For a least squares solution (assuming the model
matrix has full column rank) you simply use the solve method. For a
weighted least squares solution you premultiply both the model matrix and
the response by the square roots of the weights.

It can be collapsed to a one-liner, as in the enclosed.  A test that it
works for unweighted least squares (i.e. using unit weights) is
0.005085714 0.876285714
length(optden))))
$betahat
[1] 0.005085714 0.876285714

The use of the other decompositions (Cholesky, ColPivHouseholderQR,
Symmetric Eigendecomposition, SVD) are described in the RcppEigen vignette.

By the way, the sourceCpp, evalCpp and cppFunction capabilities developed
by the Rcpp authors is very close to magic.  They are to be congratulated
on a remarkable accomplishment.
On Wed, Dec 5, 2012 at 1:16 PM, Tim Triche, Jr. <tim.triche at gmail.com>wrote:

            
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/rcpp-devel/attachments/20121206/e6b3c442/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: wtls.cpp
Type: text/x-c++src
Size: 494 bytes
Desc: not available
URL: <http://lists.r-forge.r-project.org/pipermail/rcpp-devel/attachments/20121206/e6b3c442/attachment.cpp>
#
On 6 December 2012 at 12:33, Douglas Bates wrote:
| // [[Rcpp::depends(RcppEigen)]]
| #include <RcppEigen.h>
| 
| typedef Eigen::MatrixXd                 Mat;
| typedef Eigen::Map<Mat>                MMat;
| typedef Eigen::HouseholderQR<Mat>        QR;
| typedef Eigen::VectorXd                 Vec;
| typedef Eigen::Map<Vec>                MVec;
| 
| // [[Rcpp::export]]
| 
| Rcpp::List wtls(const MMat X, const MVec y, const MVec sqrtwts) {
|     return Rcpp::List::create(Rcpp::Named("betahat") =
| 			      QR(sqrtwts.asDiagonal()*X).solve(sqrtwts.asDiagonal()*y));
| }

That is a thing of beauty.  Nicely done, Doug!

Dirk
#
Thanks Douglas, I will study your code and Eigen carefully.

Best wishes!

Honglang Wang

Office C402 Wells Hall
Department of Statistics and Probability
Michigan State University
1579 I Spartan Village, East Lansing, MI 48823
wangho16 at msu.edu
On Thu, Dec 6, 2012 at 1:33 PM, Douglas Bates <bates at stat.wisc.edu> wrote:

            
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/rcpp-devel/attachments/20121206/78258caa/attachment-0001.html>
3 days later
c s
#
Hi Honglang,

I recommend looking into using the solve() function in Armadillo, instead
of inv(). Using solve() will be considerably faster.

More details here:
http://arma.sourceforge.net/docs.html#solve


On Thursday, December 6, 2012, Honglang Wang <wanghonglang2008 at gmail.com>
wrote:
minimal or not:
eps=unlist(lapply(1:n,function(x){as.matrix(arima.sim(list(ar=rho),m,sd=sqrt(sig2*(1-rho^2))))}))
betahat(ker,x,X,y,t,h,m)$betahat))))
NumericVector tr, double h, int m) {
have
have no
not
of
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/rcpp-devel/attachments/20121210/124ff8b5/attachment.html>
1 day later
#
I tried out that if the dim of the matrix less than or equal to 4, inv()
and solve() have the same speed.

Best wishes!

Honglang Wang

Office C402 Wells Hall
Department of Statistics and Probability
Michigan State University
1579 I Spartan Village, East Lansing, MI 48823
wangho16 at msu.edu
On Mon, Dec 10, 2012 at 8:05 AM, c s <conradsand.arma at gmail.com> wrote:

            
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/rcpp-devel/attachments/20121211/1480a2e8/attachment.html>
#
Dear All,
I have modified the code as following. Since I found the old code had the
memory allocation issue, I calculated each term of the matrix with dim 4.
The speed is pretty much good now. But since I am not familiar with Rcpp,
the code sometimes will get error, like

 *** caught segfault ***
address 0x16, cause 'memory not mapped'
Error in betahat(ker, inv[l], cbind(inv[1:ll], inv[(ll + 1):(2 * ll)]),  :
  badly formed function expression
Calls: system.time ... apply -> FUN -> betahat -> .External -> cpp_exception
Execution halted

I am not sure whether my following code has some big issue mentioned in the
above error hint. I repeatedly call the following function in my entire
code.

// [[Rcpp::depends(RcppArmadillo)]]

#include <RcppArmadillo.h>

using namespace Rcpp;

// [[Rcpp::export]]
List betahat(Function ker, double t0, NumericMatrix Xr, NumericMatrix yr,
NumericVector tr, double h, int m) {
  int n = Xr.nrow(), p = Xr.ncol();
  arma::mat X(Xr.begin(), n, p, false);
  arma::mat y(yr.begin(), n, 1, false);
  arma::colvec t(tr.begin(), tr.size(), false);
  arma::mat T = X;
  T.each_col() %= (t-t0)/h;
  arma::vec K =as<arma::vec>(ker(tr-t0,h))/m;
  double L1 = arma::accu(K%X.col(0)%X.col(0));
  double L2 = arma::accu(K%X.col(0)%X.col(1));
  double L3 = arma::accu(K%X.col(1)%X.col(1));
  double L4 = arma::accu(K%X.col(0)%T.col(0));
  double L5 = arma::accu(K%X.col(1)%T.col(0));
  double L6 = arma::accu(K%X.col(1)%T.col(1));
  double L7 = arma::accu(K%T.col(0)%T.col(0));
  double L8 = arma::accu(K%T.col(0)%T.col(1));
  double L9 = arma::accu(K%T.col(1)%T.col(1));
  double R1 = arma::accu(K%X.col(0)%y);
  double R2 = arma::accu(K%X.col(1)%y);
  double R3 = arma::accu(K%T.col(0)%y);
  double R4 = arma::accu(K%T.col(1)%y);
  arma::mat L(2*p,2*p);
  L(0,0)=L1;L(0,1)=L2;L(0,2)=L4;L(0,3)=L5;
  L(1,0)=L2;L(1,1)=L3;L(1,2)=L5;L(1,3)=L6;
  L(2,0)=L4;L(2,1)=L5;L(2,2)=L7;L(2,3)=L8;
  L(3,0)=L5;L(3,1)=L6;L(3,2)=L8;L(3,3)=L9;
  arma::mat R(2*p,1);
  R(0,0)=R1;R(1,0)=R2;R(2,0)=R3;R(3,0)=R4;
  arma::vec betahat = arma::solve(L,R);
  arma::colvec betahat0(betahat.begin(),betahat.size()/2,false);
  return List::create(Named("betahat") = betahat0);
}

Best wishes!

Honglang Wang

Office C402 Wells Hall
Department of Statistics and Probability
Michigan State University
1579 I Spartan Village, East Lansing, MI 48823
wangho16 at msu.edu
On Thu, Dec 6, 2012 at 2:10 AM, Darren Cook <darren at dcook.org> wrote:

            
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/rcpp-devel/attachments/20121211/5fc6eda4/attachment-0001.html>
#
And actually, i got three different types of errors:

1)Error in betahat(ker, inv[l], cbind(inv[1:ll], inv[(ll + 1):(2 * ll)]),  :
  unused argument(s) (n = 1)
Calls: system.time ... apply -> FUN -> betahat -> .External -> cpp_exception
Execution halted

2)Error in betahat(ker, inv[l], cbind(inv[1:ll], inv[(ll + 1):(2 * ll)]),  :
  promise already under evaluation: recursive default argument reference or
ear\
lier problems?
Calls: system.time ... apply -> FUN -> betahat -> .External -> cpp_exception
Execution halted

3)*** caught segfault ***
address 0x1, cause 'memory not mapped'
Error in betahat(ker, inv[l], cbind(inv[1:ll], inv[(ll + 1):(2 * ll)]),  :
  badly formed function expression
Calls: system.time ... apply -> FUN -> betahat -> .External -> cpp_exception
Execution halted

And I really have no idea what is the problem.

Best wishes!

Honglang Wang

Office C402 Wells Hall
Department of Statistics and Probability
Michigan State University
1579 I Spartan Village, East Lansing, MI 48823
wangho16 at msu.edu



On Tue, Dec 11, 2012 at 10:16 AM, Honglang Wang
<wanghonglang2008 at gmail.com>wrote:

            
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.r-forge.r-project.org/pipermail/rcpp-devel/attachments/20121212/7a047615/attachment.html>
#
The comment of Doug Bates on this thread motivated me to better examine
opportunities for algorithm improvements in my R package c++ code written
under Rcpp and RcppArmadillo.   I performed a few changes:  firstly, i
removed all general matrix inverse computations, replacing many with
triangular solving from cholesky decompositions (e.g. to conduct posterior
sampling from a multivariate gaussian where the precision matrix is readily
available).   secondly, instead of repeatedly performing high-dimensional
matrix computations composing sampled parameter and design objects during a
sequential sampling scan, I made the computation once per iteration and
then operated on the resulting large object by repeatedly extracting
portions, pre-sampling, and inserting back the post-sampled result during a
sequential scan.   thirdly,  while linear algebra compositions of
parameters and data objects must be rendered repeatedly across posterior
sampling iterations, compositions of design/data objects (with each other)
may be performed only once and re-used.   I had been sloppy and was
repeatedly performing computations purely involving data objects inside the
sampling loops.   I replaced this approach by slicing and dicing the design
objects and performing the matrix operations, once, and then repeatedly
inserting the objects during sampling.   While the first two steps -
particularly the second - produced some runtime speed improvements, I was
surprised that the third step produced only a very small improvement where
I had expected a large reduction in runtime.   I used RcppArmadillo and
placed these sliced and diced matrix objects into fields from which I then
extracted elements during sampling.   So I replaced matrix computations
with extractions from field containers.   Is it possible that extraction
speeds from fields and multiplying a chain of moderate-dimensioned matrices
(e.g. 5 x 600)  are of equivalent computational intensity?  Was the
compiler compensating for my hackery?


On Tue, Dec 11, 2012 at 7:07 AM, Honglang Wang
<wanghonglang2008 at gmail.com>wrote: