.Call in R

7 messages · Raymond, Dirk Eddelbuettel, Karl Forner +2 more

#
Hi R developers,

    I am new to this forum and hope someone can help me with .Call in R.
Greatly appreciate any help!

    Say I have a vector called "vecA" of length 10000, and I generate a vector
called "vecR" with elements drawn at random from Uniform[0,1]. Both vecA
and vecR are of type double. I want to replace the elements of vecA with the
elements of vecR only if the sum of the elements in vecR is greater than or
equal to 5000. Otherwise, vecA remains unchanged. This is easy to do in R:
    vecA<-something;
    vecR<-runif(10000);
    if (sum(vecR)>=5000){
       vecA<-vecR;
    }


    Now my question is: if I do the same thing from R using .Call, how can I
make it more efficient (i.e. less computation time than the pure R code
above)?  My C code (called "change_vecA.c") using .Call is like this:

    #include <R.h>
    #include <Rinternals.h>
    #include <Rmath.h>   /* runif() */

    SEXP change_vecA(SEXP vecA){
         int i,vecA_len;
         double sum,*res_ptr,*vecR_ptr,*vecA_ptr;
         SEXP res_vec,vecR;

         vecA_ptr=REAL(vecA);
         vecA_len=length(vecA);

         /* both new vectors have the same length as vecA */
         PROTECT(res_vec=allocVector(REALSXP, vecA_len));
         PROTECT(vecR=allocVector(REALSXP, vecA_len));
         res_ptr=REAL(res_vec);
         vecR_ptr=REAL(vecR);
         GetRNGstate();
         sum=0.0;
         for (i=0;i<vecA_len;i++){
              vecR_ptr[i]=runif(0,1);
              sum+=vecR_ptr[i];
         }
         if (sum>=5000){
            /*copy vecR to the vector to be returned*/
            for (i=0;i<vecA_len;i++){
                  res_ptr[i]=vecR_ptr[i];
            }
         }
         else{
                /*copy vecA to the vector to be returned*/
                for (i=0;i<vecA_len;i++){
                      res_ptr[i]=vecA_ptr[i];
                }
         }

         PutRNGstate();
         UNPROTECT(2);
         return res_vec;
    }
My R wrapper function is
        change_vecA<-function(vecA){
              dyn.load("change_vecA.so");
              .Call("change_vecA",vecA);
        }
   
         Now my question is: given the two loops (one generates the random
vector and one fills the vector to be returned), can .Call still be faster
than the pure R code (which needs only one copy of vecR to vecA when the
condition is met)? Or, how can I improve my C code to avoid redundant loops,
if any? My concern is that if vecA is large (say of length 1000000 or even
bigger), the loops in the C code can slow things down.  Thanks for any help!

          



--
View this message in context: http://r.789695.n4.nabble.com/Call-in-R-tp4080721p4080721.html
Sent from the R devel mailing list archive at Nabble.com.
#
On 17 November 2011 at 09:09, Raymond wrote:
| Hi R developers,
| 
|     I am new to this forum and hope someone can help me with .Call in R.
| Greatly appreciate any help!
| 
|     Say, I have a vector called "vecA" of length 10000, I generate a vector
| called "vecR" with elements randomly generated from Uniform[0,1]. Both vecA
| and vecR are of double type. I want to replace elements vecA by elements in
| vecR only if sum of elements in vecR is greater than or equal to 5000.
| Otherwise, vecR remain unchanged. This is easy to do in R, which reads
|     vecA<-something;
|     vecR<-runif(10000);
|     if (sum(vecR)>=5000)){
|        vecA<-vecR;
|     }
| 
| 
|     Now my question is, if I am going to do the same thing in R using .Call.
| How can I achieve it in a more efficient way (i.e. less computation time
| compared with pure R code above.).  My c code (called "change_vecA.c") using
| .Call is like this:

Here is my take on it, using about the same number of commands in C++ thanks
to Rcpp and its vectorised sum() and runif() commands (which mimic the R
commands):

R> library(inline)
R> library(Rcpp)
R> 
R> set.seed(42)         # fix RNG seed
R> vecA <- rt(10000, 6) # 'something' in vecA: t-dist with 6 df
R> 
R> fun <- cxxfunction(signature(va="numeric"), # pass in a vector
+                    plugin="Rcpp",      # use Rcpp, and code below
+                    body='
+ 
+    Rcpp::NumericVector vA(va);
+    Rcpp::RNGScope tmp;                     // make sure RNG is set up
+    Rcpp::NumericVector vR = runif(10000);  // 10k of a U(0,1)
+ 
+    if (sum(vR) >= 5000) {                  // sum is an Rcpp sugar op.
+       vA = vR;                             // swap vR into vA 
+    }
+    return(vA);                             // return vA
+ ')
R> 
R> sum( fun( vecA ) )
[1] 5033
R> sum( fun( vecA ) )
[1] 5015
R> sum( fun( vecA ) )
[1] 66
R> sum( fun( vecA ) )
[1] 66
R> sum( fun( vecA ) )
[1] 5015
R> sum( fun( vecA ) )
[1] 5024
R> sum( fun( vecA ) )
[1] 5020
R>
R> sum(vecA)
[1] 66
R> 

You can learn about Rcpp from the vignettes in the package, from my page at
http://dirk.eddelbuettel.com/code/rcpp.html, from various posts on my
blog, and of course from the rcpp-devel mailing list.  The example above uses
cxxfunction() from the wonderful inline package, which you may find useful
too as it compiles, links and loads your C or C++ snippets.

Dirk
#
On 11/18/2011 07:08 AM, Karl Forner wrote:
My thought as well, but actually the deviates need to have mean > .5 so 
you'd do something like

   repeat {
      vecA <- runif(10000)
      if (mean(vecA) > .5) break
   }

You'd do this 1/2 the time, and you'd have to iterate on average 1 / 
(1/2) = 2 times before getting a vector satisfying the constraint, so 
the expected number of iterations is 1/2 * 2 = 1, the same as in the 
original implementation!

It does suggest that there is only one allocation required, if this were 
coded at the C level. But since sum(), mean(), and runif() all go more 
or less directly to C anyway it doesn't seem like this is the right 
problem for a C solution.
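
The single-allocation idea can be sketched in plain C as follows. This is a
minimal illustration, not code from the thread: fill_or_keep and toy_unif are
hypothetical names, and toy_unif is a stand-in generator (splitmix64) so the
sketch compiles without R. In an actual .Call routine the output buffer would
presumably be REAL() of a vector from allocVector(), and the draws would come
from unif_rand() between GetRNGstate() and PutRNGstate().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in uniform generator (splitmix64), NOT R's RNG. It is used here
   only so the sketch is self-contained; it returns a value in [0, 1). */
static double toy_unif(uint64_t *state)
{
    uint64_t z = (*state += 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    z ^= z >> 31;
    return (double)(z >> 11) * (1.0 / 9007199254740992.0); /* divide by 2^53 */
}

/* Hypothetical helper: write n uniform draws directly into `out` while
   accumulating their sum. If the sum is below `threshold`, overwrite `out`
   with the original values from `vec_a`.
   Returns 1 if the draws were kept, 0 if vec_a was copied back. */
static int fill_or_keep(const double *vec_a, double *out, size_t n,
                        double threshold, uint64_t *state)
{
    assert(vec_a != NULL && out != NULL && state != NULL);

    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        out[i] = toy_unif(state);
        sum += out[i];
    }
    if (sum >= threshold)
        return 1;                        /* draws stay in place, no copy */

    memcpy(out, vec_a, n * sizeof *out); /* fall back to the original */
    return 0;
}
```

Because the draws go straight into the result buffer, no separate vecR is
needed, and the copy of vecA happens only when the sum falls short.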

Martin

  
    
#
Because if you calculate the probability and then make uniform values,
nothing guarantees that the sum of those uniform values actually is
larger than 5,000. You only have 50% chance it is, in fact...
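
The 50% figure follows from symmetry (the sum of n Uniform[0,1] draws is
symmetric around n/2), and it can also be checked by simulation. The sketch
below is only an illustration under stated assumptions: sim_unif is a
stand-in generator (splitmix64), not R's RNG, and frac_sum_at_least_half is a
hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in uniform generator (splitmix64), NOT R's RNG; returns [0, 1). */
static double sim_unif(uint64_t *state)
{
    uint64_t z = (*state += 0x9E3779B97F4A7C15ULL);
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    z ^= z >> 31;
    return (double)(z >> 11) * (1.0 / 9007199254740992.0); /* divide by 2^53 */
}

/* Hypothetical helper: fraction of `trials` in which the sum of n uniform
   draws is at least n/2. */
static double frac_sum_at_least_half(size_t n, size_t trials, uint64_t seed)
{
    assert(n > 0 && trials > 0);

    uint64_t state = seed;
    size_t hits = 0;
    for (size_t t = 0; t < trials; t++) {
        double sum = 0.0;
        for (size_t i = 0; i < n; i++)
            sum += sim_unif(&state);
        if (sum >= 0.5 * (double)n)
            hits++;
    }
    return (double)hits / (double)trials;
}
```

Calling frac_sum_at_least_half(10000, 2000, 42) mirrors the sizes in the
thread; the returned fraction should land close to one half.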
Cheers
Joris
On Fri, Nov 18, 2011 at 4:08 PM, Karl Forner <karl.forner at gmail.com> wrote:

  
    