[Rcpp-devel] Some questions regarding internals of Rcpp objects

5 messages · Ulrich Bodenhofer, Dirk Eddelbuettel, Romain Francois

#
Dear list members,

I have started developing a package using the latest version of Rcpp 
just recently, so I am more or less an Rcpp newbie. Since I am not 
sufficiently proficient in advanced C++ (and admittedly too lazy too) to 
analyze the Rcpp source code, I have some questions regarding what 
actually happens inside Rcpp. Let me bother you with an example:

RcppExport SEXP test(SEXP mat)
{
BEGIN_RCPP
     Rcpp::NumericMatrix matC(mat);
     Rcpp::NumericMatrix matD(mat);

     matC(1, 1) = 1;
     matD(1, 1) = -matD(1, 1);

     return(matC);
END_RCPP
}

If I call this function on some matrix, then the function returns the 
input matrix with its element [2, 2] set to 1, but that's all. The 
change in matD only seems to have an effect on the local copy matD. If I 
replace "return(matC)" by "return(matD)", then the function returns the 
input matrix with the sign of its element [2, 2] inverted, so the change 
in matC has no effect. That Rcpp creates local copies, i.e. that matC 
and matD are independent objects each holding local copies of the 
argument mat, would perfectly explain these results. Previously, 
however, I thought Rcpp would only wrap an R/SEXP object into an Rcpp 
object without making a copy. If it was so, no matter whether the 
function returns matC or matD, the function would return the input 
matrix with -1 on position [2, 2]. Did I get this point wrong? I suppose 
it would save time and memory not to create a copy, in particular, since 
arguments to functions are local copies already. Is there a way to avoid 
copying?

A related question is what the "=" operator does. Does

     Rcpp::NumericVector x = ...;

create a copy or is it only a pointer that is assigned?

My final question concerns the "longevity" of Rcpp objects. Suppose I do 
the following:

     {
         Rcpp::NumericVector vec(10);
         double *p = vec.begin();
     }

Can I use *p outside this block or is the pointer killed along with the 
variable vec after the block is closed? According to my limited 
knowledge, I would assume that it is only the reference vec that is not 
available outside the block, but that the corresponding R object still 
exists and will be deleted by the R garbage collector only after the 
whole function has terminated.

Any help is gratefully appreciated! I am sorry if these are stupid 
questions or if any of them have been discussed previously.

Thanks and best regards,
Ulrich

#
Hi Ulrich,

These are really several questions.  I'll try to address them all, with some
luck maybe Romain, Doug, Christian, ... will correct or complement as needed.
On 5 August 2011 at 11:45, Ulrich Bodenhofer wrote:
| Dear list members,
| 
| I have started developing a package using the latest version of Rcpp just
| recently, so I am more or less an Rcpp newbie. Since I am not sufficiently
| proficient in advanced C++ (and admittedly too lazy too) to analyze the Rcpp
| source code, I have some questions regarding what actually happens inside Rcpp.
| Let me bother you with an example:
| 
| RcppExport SEXP test(SEXP mat)
| {
| BEGIN_RCPP
|     Rcpp::NumericMatrix matC(mat);
|     Rcpp::NumericMatrix matD(mat);
| 
|     matC(1, 1) = 1;
|     matD(1, 1) = -matD(1, 1);
| 
|     return(matC);
| END_RCPP
| }
| 
| If I call this function on some matrix, then the function returns the input
| matrix with its element [2, 2] set to 1, but that's all. The change in matD
| only seems to have an effect on the local copy matD. If I replace
| "return(matC)" by "return(matD)", then the function returns the input matrix
| with the sign of its element [2, 2] inverted, so the change in matC has no
| effect. That Rcpp
| creates local copies, i.e. that matC and matD are independent objects each
| holding local copies of the argument mat, would perfectly explain these

This has been discussed here before.  The matrices both receive a _pointer_
(as a SEXP is a pointer) to the original memory of the R-level matrix you
pass in.  

You need to explicitly request deep copies via clone() if you want that.  See
this variant:

R> fun <- cxxfunction(signature(mat="numeric"), plugin="Rcpp", body='
+      Rcpp::NumericMatrix matC(mat);
+      Rcpp::NumericMatrix matD(mat);
+ 
+      matC(1, 1) = 1;
+      matD(1, 1) = -matD(1, 1);
+ 
+      return(Rcpp::List::create(Rcpp::Named("C")=matC,
+                                Rcpp::Named("D")=matD));
+ ');
R> fun(matrix(42,2,2))
$C
     [,1] [,2]
[1,]   42   42
[2,]   42   -1

$D
     [,1] [,2]
[1,]   42   42
[2,]   42   -1

R> 

So C and D are really the same thing.
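
To make the pointer semantics concrete, here is a minimal sketch in plain C++
that does not use Rcpp at all. The names Storage, Handle and deep_copy are made
up for illustration: Handle stands in for Rcpp::NumericMatrix, the
std::shared_ptr stands in for the SEXP, and deep_copy plays the part of
clone(). It models only the sharing behaviour described above, not how Rcpp is
actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the memory of an R numeric matrix.
using Storage = std::vector<double>;

// Hypothetical stand-in for Rcpp::NumericMatrix: it holds only a pointer.
struct Handle {
    std::shared_ptr<Storage> data;

    // Wrapping existing storage copies the pointer, not the elements.
    explicit Handle(std::shared_ptr<Storage> s) : data(std::move(s)) {}

    double& operator[](std::size_t i) { return (*data)[i]; }
};

// Deep copy, playing the role that clone() plays in the message above.
Handle deep_copy(const Handle& h) {
    return Handle(std::make_shared<Storage>(*h.data));
}

// Mirrors the quoted example: write through one handle, then the other.
double shared_result() {
    auto mat = std::make_shared<Storage>(4, 42.0);  // like matrix(42, 2, 2)
    Handle c(mat), d(mat);
    assert(c.data == d.data);  // both handles point at the same storage
    c[3] = 1;                  // element [2, 2] in column-major order
    d[3] = -d[3];              // sees the 1 written through c
    return (*mat)[3];
}

// Same steps, but d works on a deep copy taken before any modification.
double cloned_result() {
    auto mat = std::make_shared<Storage>(4, 42.0);
    Handle c(mat);
    Handle d = deep_copy(c);
    assert(c.data != d.data);  // the copy has storage of its own
    c[3] = 1;
    d[3] = -d[3];              // flips only the copy's 42
    return (*mat)[3];
}
```

Under these assumptions shared_result() yields -1, matching the C and D output
above, while cloned_result() yields 1 because the sign flip lands on the copy.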


| results. Previously, however, I thought Rcpp would only wrap an R/SEXP object
| into an Rcpp object without making a copy. If it was so, no matter whether the
| function returns matC or matD, the function would return the input matrix with
| -1 on position [2, 2]. Did I get this point wrong? I suppose it would save time
| and memory not to create a copy, in particular, since arguments to functions
| are local copies already. Is there a way to avoid copying?
| 
| A related question is what the "=" operator does. Does
| 
|     Rcpp::NumericVector x = ...;
| 
| create a copy or is it only a pointer that is assigned?

IIRC the pointer is copied, so both objects still point to the same memory.
 
| My final question concerns the "longevity" of Rcpp objects. Suppose I do the
| following:
| 
|     {
|         Rcpp::NumericVector vec(10);
|         double *p = vec.begin();
|     }
| 
| Can I use *p outside this block or is the pointer killed along with the
| variable vec after the block is closed? According to my limited knowledge, I

p as a variable is gone outside of this { } scope. So is vec but as you say ...

| would assume that it is only the reference vec that is not available outside
| the block, but that the corresponding R object still exists and will be deleted
| by the R garbage collector only after the whole function has terminated.

... we do not explicitly destroy in the destructor but let R take care of
it. So it will live "some random amount" of time you have no control over. Eeek.

In a nutshell, if you want longer-living objects, assign in R before you call
a function -- you are pretty much guaranteed it will be there.

If you need persistence between different calls from R you need an object
that persists, and there are many possible C++ tricks to do that.
 
| Any help is gratefully appreciated! I am sorry if these are stupid questions or
| if any of them have been discussed previously.

Great questions, but some of the material has been discussed before.  We
probably need a better FAQ or better search engine...

Cheers, Dirk
| 
| Thanks and best regards,
| Ulrich
| 
| 
| ----------------------------------------------------------------------
| _______________________________________________
| Rcpp-devel mailing list
| Rcpp-devel at lists.r-forge.r-project.org
| https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/rcpp-devel
#
Dear Dirk,

Thanks for your detailed comments!

Regarding the first question: as I wrote, I was actually assuming that 
Rcpp would not make deep copies.
Fine. If so, I actually wonder what's wrong with my function

   RcppExport SEXP test(SEXP mat)
   {
   BEGIN_RCPP
       Rcpp::NumericMatrix matC(mat);
       Rcpp::NumericMatrix matD(mat);

       matC(1, 1) = 1;
       matD(1, 1) = -matD(1, 1);

       return(matC);
   END_RCPP
   }

I tried it on matrix(42, 2, 2) too and, surprise, it returns -1 as 
element [2, 2] - just as in your example. So far so good. However, if I 
apply it to matrix(1:16, 4, 4) (and this is an example similar to the 
one I tried before, which actually made me write this question to the 
list), I get the following (note the 1 at [2, 2]):
> test(matrix(1:16, 4, 4))
     [,1] [,2] [,3] [,4]
[1,]    1    5    9   13
[2,]    2    1   10   14
[3,]    3    7   11   15
[4,]    4    8   12   16


If it's not the copying, what else is wrong here? I actually don't get 
it. Very mysterious!

Regarding my third question, I have to admit that my example was flawed. 
Stupid me! I wanted to write something like the following:

   double *p;

   {
      Rcpp::NumericVector vec(10);
      p = vec.begin();
   }

As I understand your reply, I could use p outside the block, as it still 
points to the data inside the R object created inside the block, right?

Thanks again for your very helpful answers!

Best regards,
Ulrich
#
On 06/08/11 06:40, Ulrich Bodenhofer wrote:
No mystery.

 > typeof(matrix(1:16, 4, 4))
[1] "integer"

So NumericMatrix has no choice but to coerce your data to numeric, which
means matC and matD are two new objects, each different from mat.

Regarding the use of p outside the block: very dangerous. vec will be
garbage collected at some time you don't control. Don't do that.
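
The coercion effect can be imitated in plain C++ without Rcpp. This is only a
sketch under stated assumptions: DoubleStore, IntStore, as_numeric and the
other names are invented for illustration, and as_numeric merely mimics the
observable behaviour (reuse double storage, convert integer storage into a
fresh buffer), not Rcpp's internals.

```cpp
#include <cassert>
#include <memory>
#include <numeric>
#include <vector>

// Hypothetical stand-ins for the memory of a double and an integer R matrix.
using DoubleStore = std::shared_ptr<std::vector<double>>;
using IntStore = std::shared_ptr<std::vector<int>>;

// Double input: hand back the very same storage, no copy.
DoubleStore as_numeric(const DoubleStore& d) { return d; }

// Integer input: the values must be converted, so a new buffer is allocated.
DoubleStore as_numeric(const IntStore& i) {
    return std::make_shared<std::vector<double>>(i->begin(), i->end());
}

// Helpers building the values 1..16, like matrix(1:16, 4, 4).
IntStore int_input() {
    auto v = std::make_shared<std::vector<int>>(16);
    std::iota(v->begin(), v->end(), 1);
    return v;
}

DoubleStore double_input() {
    auto v = std::make_shared<std::vector<double>>(16);
    std::iota(v->begin(), v->end(), 1.0);
    return v;
}

// Mirrors test(): wrap the input twice, edit element [2, 2] through each
// wrapper, and return what the first wrapper sees afterwards.
template <class Store>
double element_after_edits(const Store& input) {
    DoubleStore c = as_numeric(input);
    DoubleStore d = as_numeric(input);
    const std::size_t k = 1 + 1 * 4;  // [2, 2] in a 4 x 4 column-major matrix
    assert((*c)[k] == 6.0);
    (*c)[k] = 1;
    (*d)[k] = -(*d)[k];
    return (*c)[k];
}
```

With integer input, c and d are separate buffers, so element_after_edits
returns 1, as in the 4 x 4 output Ulrich reported. With double input both
wrappers share one buffer and the result is -1.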

  
    
#
Hi Ulrich,
On 6 August 2011 at 06:40, Ulrich Bodenhofer wrote:
| Dear Dirk,
| 
| Thanks for your detailed comments!
| 
| Regarding the first question: as I wrote, I was actually assuming that 
| Rcpp would not make deep copies.

What we now call RcppClassic always made deep copies. That's how it was then.

But for the last few years, the new Rcpp has always kept the original SEXP
without copying --- in other words, no deep copy unless you call clone().

| > [...]
| > See this variant:
| >
| > R>  fun<- cxxfunction(signature(mat="numeric"), plugin="Rcpp", body='
| > +      Rcpp::NumericMatrix matC(mat);
| > +      Rcpp::NumericMatrix matD(mat);
| > +
| > +      matC(1, 1) = 1;
| > +      matD(1, 1) = -matD(1, 1);
| > +
| > +      return(Rcpp::List::create(Rcpp::Named("C")=matC,
| > +                                Rcpp::Named("D")=matD));
| > + ');
| > R>  fun(matrix(42,2,2))
| > $C
| >       [,1] [,2]
| > [1,]   42   42
| > [2,]   42   -1
| >
| > $D
| >       [,1] [,2]
| > [1,]   42   42
| > [2,]   42   -1
| >
| > R>
| >
| > So C and D are really the same thing.
| >
| Fine. If so, I actually wonder what's wrong with my function
| 
|    RcppExport SEXP test(SEXP mat)
|    {
|    BEGIN_RCPP
|        Rcpp::NumericMatrix matC(mat);
|        Rcpp::NumericMatrix matD(mat);
| 
|        matC(1, 1) = 1;
|        matD(1, 1) = -matD(1, 1);
| 
|        return(matC);
|    END_RCPP
|    }
| 
| I tried it on matrix(42, 2, 2) too and, surprise, it returns -1 as 
| element [2, 2] - just as in your example. So far so good. However, if I 
| apply it to matrix(1:16, 4, 4) (and this is an example similar to the 
| one I tried before, which actually made me write this question to the 
| list), I get the following (note the 1 at [2, 2]):
| 
| > test(matrix(1:16, 4, 4))
|       [,1] [,2] [,3] [,4]
| [1,]    1    5    9   13
| [2,]    2    1   10   14
| [3,]    3    7   11   15
| [4,]    4    8   12   16
| 
| 
| If it's not the copying, what else is wrong here? I actually don't get 
| it. Very mysterious!

It is _the same thing_ as I just discussed in the emails to Zhongyi. Recall
that you declared a _NumericMatrix_, so an integer matrix has to be converted
(and hence copied) first. If we pass a double matrix instead:

R> fun(matrix(seq(1.0, 16.0, by=1.0), 4, 4))
$C
     [,1] [,2] [,3] [,4]
[1,]    1    5    9   13
[2,]    2   -1   10   14
[3,]    3    7   11   15
[4,]    4    8   12   16

$D
     [,1] [,2] [,3] [,4]
[1,]    1    5    9   13
[2,]    2   -1   10   14
[3,]    3    7   11   15
[4,]    4    8   12   16

R> 
 
C and D are once again the same.  Same code as in my previous email.

| Regarding my third question, I have to admit that my example was flawed. 
| Stupid me! I wanted to write something like the following:
| 
|    double *p;
| 
|    {
|       Rcpp::NumericVector vec(10);
|       p = vec.begin();
|    }
| 
| As I understand your reply, I could use p outside the block, as it still 
| points to the data inside the R object created inside the block, right?

Well, maybe, but it still goes against decades of C and C++ programming
tradition to access something from outside its scope.

If you want it to last, create it earlier (in R, as a global, ...).  What you
have here seems very much like undefined behaviour to me.
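
A small plain C++ sketch of the "create it earlier" advice, with no Rcpp
involved. Everything here is an assumption made for illustration: Storage and
both function names are invented, std::shared_ptr stands in for the R object's
owner, and the std::weak_ptr stands in for the raw pointer p. One deliberate
difference: this model frees the storage as soon as the block ends, whereas R
would collect it at some later, unpredictable time.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for the memory behind an Rcpp::NumericVector.
using Storage = std::vector<double>;

// Risky pattern from the thread, made observable: the only owner lives in
// the inner block, so nothing keeps the storage alive afterwards.
bool storage_gone_after_block() {
    std::weak_ptr<Storage> observer;  // plays the role of the raw pointer p
    {
        auto vec = std::make_shared<Storage>(10, 0.0);
        observer = vec;
        assert(!observer.expired());  // still alive inside the block
    }
    return observer.expired();
}

// Safer pattern: declare the owner in the enclosing scope.
double read_with_owner_outside() {
    auto vec = std::make_shared<Storage>(10, 0.0);
    double* p = nullptr;
    {
        p = vec->data();
        p[0] = 3.5;
    }
    return p[0];  // valid, because vec outlives the inner block
}
```

In this model storage_gone_after_block() returns true, while
read_with_owner_outside() returns 3.5 because the owner is still in scope
when p is dereferenced.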
 
| Thanks again for your very helpful answers!

Pleasure!

Dirk
 
| Best regards,
| Ulrich