[Rcpp-devel] How to increase the coding efficiency
Hi Honglang, I recommend looking into using the solve() function in Armadillo, instead of inv(). Using solve() will be considerably faster. More details here: http://arma.sourceforge.net/docs.html#solve On Thursday, December 6, 2012, Honglang Wang <wanghonglang2008 at gmail.com> wrote:
The following is a full example although I don't know whether it's
minimal or not:
library(Rcpp)
library(RcppArmadillo)
sourceCpp("betahat_mod.cpp")
#The following is data generation.
n=200
m=20
p=2
t=runif(m*n,min=0, max=1)
X1=rnorm(m*n,0,1)
X1=as.matrix(1+2*exp(t)+X1)
X2=rnorm(m*n,0,1)
X2=as.matrix(3-4*t^2+X2)
X=cbind(X1,X2)
beta1=0.5*sin(t)
beta2=beta1
rho=0.2
sig2=1
eps=unlist(lapply(1:n,function(x){as.matrix(arima.sim(list(ar=rho),m,sd=sqrt(sig2*(1-rho^2))))}))
y=as.matrix(X1*beta1+X2*beta2+eps)
#A simple kernel function.
ker=function(x,h)
{
ans=x
lo=(abs(x)<h)
ans[lo]=(3/4)*(1-(x[lo]/h)^2)/h
ans[!lo]=0
return(ans)
}
h=0.3
#assess the time for evaluating bethat of 100 t's:
system.time((betahatt=t(apply(as.matrix(t[1:100]),1,function(x)
betahat(ker,x,X,y,t,h,m)$betahat))))
And the .cpp file is the following: // [[Rcpp::depends(RcppArmadillo)]] #include <RcppArmadillo.h> using namespace Rcpp; // [[Rcpp::export]] List betahat(Function ker, double t0, NumericMatrix Xr, NumericMatrix yr,
NumericVector tr, double h, int m) {
int n = Xr.nrow(), p = Xr.ncol();
arma::mat X(Xr.begin(), n, p, false);
arma::mat y(yr.begin(), n, 1, false);
arma::colvec t(tr.begin(), tr.size(), false);
arma::mat T = X;
T.each_col() %= (t-t0)/h;
arma::mat D = arma::join_rows(X,T);
arma::vec kk =as<arma::vec>(ker(tr-t0,h));
arma::mat W = (arma::diagmat(kk))/m;
arma::mat Inv = arma::trans(D)*W*D;
arma::vec betahat = arma::inv(Inv)*arma::trans(D)*W*y;
arma::colvec betahat0(betahat.begin(),betahat.size()/2,false);
return List::create(Named("betahat") = betahat0);
}
Best wishes!
Honglang Wang
Office C402 Wells Hall
Department of Statistics and Probability
Michigan State University
1579 I Spartan Village, East Lansing, MI 48823
wangho16 at msu.edu
On Wed, Dec 5, 2012 at 4:07 AM, Christian Gunning <xian at unm.edu> wrote:
Can you post a minimal full example? -Christian On Tue, Dec 4, 2012 at 8:39 PM, Honglang Wang <wanghonglang2008 at gmail.com> wrote:
Yes, the main issue for my coding is the allocation of memory. And I
have
fixed one of the biggest memory allocation issue: 4000 by 4000 diagonal matrix. And since I am not familiar with Rcpp and RcppArmadillo, I
have no
idea how to reuse the memory. I hope I can have some materials to learn this. Thanks.
What exactly do these timings show? A single call you your function? How many calls?
Here I called my function for 100 times.
Building on Romain's point: -- a portion of your function's runtime is in memory allocation (and you have a lot of allocations here). If you're calling your function thousands or millions of times, then it might pay to closely examine your memory allocation strategies and figure out what's temporary, for example. It looks like you're already using copy_aux_mem = false in a number of places, but you're allocating a lot of objects -- of approx what size? For example, wouldn't this work just as well with one less allocation? arma::vec kk = t; arma::uvec q1 = arma::find(arma::abs(tp)<h); kk.elem(q1) = ((1-arma::pow(tp.elem(q1)/h,2))/h)*0.75; // done with q1. let's reuse it. q1 = arma::find(arma::abs(tp)>=h); // was q2 kk.elem(q1).zeros(); You could potentially allocate memory for temporary working space in R, grab it with copy_aux_mem = false, write your temp results there, and reuse these objects in subsequent function calls. It doesn't make sense to go to this trouble, though, if your core algorithm consumes the bulk of runtime. Have you looked on the armadillo notes r.e. inv? Matrix inversion has O(>n^2). You may be aided by pencil-and-paper math here. http://arma.sourceforge.net/docs.html#inv
Here my matrix for inverse is only 4 by 4, so I think it's ok.
best, Christian
Dear All, I have tried out the first example by using RcppArmadillo, but I am
not
sure whether the code is efficient or not. And I did the comparison
of
the computation time. 1) R code using for loop in R: 87.22s 2) R code using apply: 77.86s 3) RcppArmadillo by using for loop in C++: 53.102s 4) RcppArmadillo together with apply in R: 47.310s It is kind of not so big increase. I am wondering whether I used an inefficient way for the C++ coding:
-- A man, a plan, a cat, a ham, a yak, a yam, a hat, a canal ? Panama!
-- A man, a plan, a cat, a ham, a yak, a yam, a hat, a canal ? Panama!
-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.r-forge.r-project.org/pipermail/rcpp-devel/attachments/20121210/124ff8b5/attachment.html>