[Rcpp-devel] [R] Find number of elements less than some number: Elegant/fastsolution needed

Sun, Apr 17, 2011 11:00 AM

I enclose some comparisons, admittedly on only one size of example (10
million in the reference distribution and 100,000 in the test sample).

The versions using Rcpp and the std algorithms came out about 3 times
as fast as the version using R's findInterval.  What surprises me is
that the sequential comparisons in C++ (version k) is slightly faster
than the binary search versions (h and j).  The unassisted binary
search requires 30 probes for each element of ans and the sequential
comparisons show take an average of 100 comparisons.  I suppose the
difference is sequential access versus random access to the elements
of y.

On Fri, Apr 15, 2011 at 7:34 AM, Douglas Bates <bates at stat.wisc.edu> wrote:

On Thu, Apr 14, 2011 at 11:47 PM, Christian Gunning <xian at unm.edu> wrote:

On Thu, Apr 14, 2011 at 7:02 PM,
<rcpp-devel-request at r-forge.wu-wien.ac.at> wrote:

I was able to write a very short C++ function using the Rcpp package
that provided about a 1000-fold increase in speed relative to the best
I could do in R. ?I don't have the script on this computer so I will
post it tomorrow when I am back on the computer at the office.

Apologies for cross-posting to the Rcpp-devel list but I am doing so
because this might make a good example of the usefulness of Rcpp and
inline.

And RcppArmadillo, as the case may be.

This is a cool little problem. ?In the examples given, I'd caution
people against comparing apples and durian. ?The sort(x) is a cost
that should be considered *within* each implementation. I used
Armadillo to sort (src, f4), and get another 100% worth of speedup
that I can't reproducing using R's sort (src1, f1-f3). ?If i modify
SEXP in-place (and this always confuses me, so I tend to avoid it),
I'm seeing an additional ~5-10% speed gain (src2, f5) -- the advantage
of this last seems to be primarily in memory-constrained applications.

On to the code!

src = '
NumericVector xx_(clone(x)), yy_(clone(y));
int nxx = xx_.size();
int nyy = yy_.size();
arma::vec xx(xx_), yy(yy_);
yy = sort(yy);
xx = sort(xx);
//
//
int j = 0; //gt index for yy
for (int i=0; i < nxx; i++) {
? ?while ((j < nyy) && ( xx(i) > yy(j) ) ) {
? ? ? ?j++;
? ?}
? ?xx_(i) = j;
? }
return (xx_);
'

src1 = '
NumericVector xx_(clone(x)), yy_(clone(y));
// assumes x & y are already sorted
arma::vec xx(xx_), yy(yy_);
int nxx = xx.n_elem;
int nyy = yy.n_elem;
int j = 0; //gt index for yy
for (int i=0; i < nxx; i++) {
? ?while ((j < nyy) && ( xx(i) > yy(j) ) ) {
? ? ? ?j++;
? ?}
? ?xx_(i) = j;
?}
return (xx_);
'

src2 = '
NumericVector xx_(x), yy_(y); ?//kinda scary
int nxx = xx_.size();
int nyy = yy_.size();
arma::vec xx(xx_.begin(), nxx, false), yy(yy_.begin(), nyy, false);
//really kinda scary
yy = sort(yy);
xx = sort(xx);
//
//
int j = 0; //gt index for yy
for (int i=0; i < nxx; i++) {
? ?while ((j < nyy) && ( xx(i) > yy(j) ) ) {
? ? ? ?j++;
? ?}
? ?xx_(i) = j;
}
return (xx_);
'

require(inline)
require(RcppArmadillo)
f1 <- function(x, y) { sort(length(y) - findInterval(-x, rev(-sort(y))));}
f2 <- function(x, y) {x = sort(x); length(y) - findInterval(-x, rev(-sort(y)))}
f3.a <- cxxfunction(signature(x="numeric", y="numeric"), src1,
plugin='RcppArmadillo')
f3 <- function(x,y) {
? ? ? ?x <- sort(x)
? ? ? ?y <- sort(y)
? ? ? ?return(f3.a(x,y))
}
f4 <- cxxfunction(signature(x="numeric", y="numeric"), src,
plugin='RcppArmadillo')
## ?danger -- violates R semantics
f5 <- cxxfunction(signature(x="numeric", y="numeric"), src2,
plugin='RcppArmadillo')


## this is a really ugly test. ygwypf, i suppose :)

for (i in 1:5) {
?x1 <- x <- rnorm(5e6)
?y1 <- y <- rnorm(5e6)
?print( cbind(
? ?r1=system.time(r1 <- f1(x,y)),
? ?r2=system.time(r2 <- f2(x,y)), r3=system.time(r3 <- f3(x1,y1)),
? ?r4 = system.time(r4 <- f4(x,y)), r5 = system.time(r5 <- f5(x,y))
?))
}
print(all.equal(r1, r2))
print(all.equal(r1, r3))
print(all.equal(r1, r4))
print(all.equal(r1, r5))

I agree that the cost of sorting should be taken into account but I
don't think you need to go to the RcppArmadillo package to get a sort
function. ?Why not use std::sort?

Also, I did sequential comparisons as shown in your code but after
reading Bill Dunlap's response and looking at the documentation for
the findInterval ?function in R I smacked myself on the forehead and
thought "Duh - binary search, of course".

I haven't looked at the C code underlying the findInterval function
yet so I don't know if Martin has clever tricks for sorted x and y.
However the documentation for the std::upper_bound template at
cplusplus.com shows how to use that for the case here. ?The best I can
think of for sorted x and y is to pass the upper bound from x[i] as
the first argument in the call to std::upper_bound for x[i+1].

Unfortunately I am staring at a series of deadlines today so
implementations and comparisons may need to wait until tomorrow.

P.S. to Christian: ?Check the archives for several of Dirk's posts to
the rcpp-devel list where he has used the rbenchmark package to
produce clean output from comparisons of ?implementations of
algorithms.

-------------- next part --------------

R version 2.13.0 (2011-04-13)
Copyright (C) 2011 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

Loading required package: inline

Loading required package: Rcpp

Loading required package: rbenchmark

+     ord <- order(x)
+     ans <- integer(length(x))
+     ans[ord] <- length(y) - findInterval(-x[ord], rev(-sort(y)))
+     ans
+ }

+ {
+     Rcpp::NumericVector x(x_),
+         y = clone(Rcpp::NumericVector(y_));
+     int n = x.size();
+     Rcpp::IntegerVector ans(n);
+     const Rcpp::NumericVector::iterator bb = y.begin(), ee = y.end();
+                // sort (a copy of) the y values for binary search
+     std::sort(bb, ee);
+     
+     for (int i = 0; i < n; i++) {
+ 	ans[i] = std::upper_bound(bb, ee, x[i]) - bb;
+     }
+     return ans;
+ }
+ ', plugin="Rcpp")

+ {
+     Rcpp::NumericVector x(x_),
+         y = clone(Rcpp::NumericVector(y_));
+     int n = x.size();
+     Rcpp::IntegerVector ans(n);
+     const Rcpp::NumericVector::iterator bb = y.begin(), ee = y.end();
+     Rcpp::NumericVector::iterator mm = y.begin();
+                // sort (a copy of) the y values for binary search
+     std::sort(bb, ee);
+     
+     for (int i = 0; i < n; i++) {
+         mm = std::upper_bound(mm, ee, x[i]);
+         ans[i] = mm - bb;
+     }
+     return ans;
+ }
+ ', plugin="Rcpp")

+     ord <- order(x)
+     ans <- integer(length(x))
+     ans[ord] <- j1(x[ord], y)
+     ans
+ }

+ {
+     Rcpp::NumericVector xs(xs_),
+         y = clone(Rcpp::NumericVector(y_));
+     int n = xs.size();
+     Rcpp::IntegerVector ord(ord_), ans(n);
+     const Rcpp::NumericVector::iterator bb = y.begin(), ee = y.end();
+     Rcpp::NumericVector::iterator yy = y.begin();
+     double *xp = xs.begin();
+                // sort (a copy of) the y values for binary search
+     std::sort(bb, ee);
+     
+     for (int i = 0; i < n && yy < ee; i++, xp++) {
+ 	while (*xp >= *yy && yy < ee) yy++;
+ 	ans[ord[i] - 1] = yy - bb;
+     }
+     return ans;
+ }
+ ', plugin="Rcpp")

+     ord <- order(x)
+     k1(x[ord], y, ord)
+ }

[1] TRUE

[1] TRUE

[1] TRUE

[1] TRUE

[1] TRUE

+           order="relative", replications=10,
+           columns=c("test", "replications", "elapsed", "relative"))
     test replications elapsed relative
5 k(x, y)           10  19.477 1.000000
4 j(x, y)           10  19.984 1.026031
3 h(x, y)           10  20.107 1.032346
2 g(x, y)           10  57.691 2.962006
1 f(x, y)           10  60.563 3.109462

user  system elapsed 
213.060  15.880 233.067 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bb.R
Type: application/octet-stream
Size: 2865 bytes
Desc: not available
URL: <http://lists.r-forge.r-project.org/pipermail/rcpp-devel/attachments/20110417/cc479b0f/attachment.obj>

[Rcpp-devel] [R] Find number of elements less than some number: Elegant/fastsolution needed

Thread (3 messages)