Dear all,
I'm new to Rcpp and this mailing list. I did look for a previous answer to
this question, but it's hard to summarise succinctly so I may have missed
something. Apologies if so.
I'm defining a custom class, an object of which will need to survive across
various calls back and forth between R and C++, so I plan to use the XPtr
class to wrap a pointer. My question is, what are the advantages and
disadvantages of using Rcpp vector classes (vs std::vector) for member
variables? To be more concrete, I mean
class Foo
{
private:
    Rcpp::NumericVector bar;
};
vs
class Foo
{
private:
    std::vector<double> bar;
};
Are there garbage collection issues when these live inside an XPtr<Foo>?
Are there speed advantages of std::vector<double> over Rcpp::NumericVector
for general use? Any input would be welcome. Thanks in advance.
Great work on Rcpp, by the way. I've been hearing very good things for
quite some time, but wasn't sure if it was worth dusting off my slightly
rusty C++ for. Suffice to say I think it was. The API is very clean and
returning to the standard R API will be painful...!
All the best,
Jon
[Rcpp-devel] Rcpp vector classes vs std::vector for custom class member variables
7 messages · Dirk Eddelbuettel, Romain Francois, Jon Clayden
Hi Jon,
On 8 October 2013 at 12:04, Jon Clayden wrote:
| I'm new to Rcpp and this mailing list. I did look for a previous answer to this
| question, but it's hard to summarise succinctly so I may have missed something.
| Apologies if so.
|
| I'm defining a custom class, an object of which will need to survive across
| various calls back and forth between R and C++, so I plan to use the XPtr class
| to wrap a pointer. My question is, what are the advantages and disadvantages of
| using Rcpp vector classes (vs std::vector) for member variables? To be more
| concrete, I mean
|
| class Foo
| {
| private:
|     Rcpp::NumericVector bar;
| };
|
| vs
|
| class Foo
| {
| private:
|     std::vector<double> bar;
| };
Here is the first choice: std::vector<double> vs Rcpp::NumericVector.
Generally speaking I would use the former when I know I will interface other
C++ code requiring that interface. I use the latter if I need to 'just' pass
things back and forth and maybe use my own (locally added) routines. [ And
you can go pretty cheaply from Rcpp::NumericVector to std::vector. ]
I use a third (arma::vec and its matrices) when I do linear algebra. (And
could also use RcppEigen for different vectors).
| Are there garbage collection issues when these live inside an XPtr<Foo>?
XPtr means R does not touch it. There will never be a gc. So XPtr makes less
sense with R classes -- you want to be 'away from R' already for (large)
objects so I would tend to use XPtr of std::vector. Or maybe even XPtr of
your class Foo. Or use Jay and Mike's bigmemory (which uses an external
pointer internally too). It all depends.
| Are there speed advantages of std::vector<double> over Rcpp::NumericVector
| for general use? Any input would be welcome. Thanks in advance.
I would profile rather than believe what a random stranger on the Internet
tells you :) But ex ante there should not be a large difference. Returning
a std::vector to R may involve a copy -- not sure.
| Great work on Rcpp, by the way. I've been hearing very good things for quite
| some time, but wasn't sure if it was worth dusting off my slightly rusty C++
| for. Suffice to say I think it was. The API is very clean and returning to the
| standard R API will be painful...!
Thanks! Glad you are finding it useful.
Dirk
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
On 08/10/13 13:04, Jon Clayden wrote:
| Dear all,
| I'm new to Rcpp and this mailing list. I did look for a previous answer
| to this question, but it's hard to summarise succinctly so I may have
| missed something. Apologies if so.
| I'm defining a custom class, an object of which will need to survive
| across various calls back and forth between R and C++, so I plan to use
| the XPtr class to wrap a pointer. My question is, what are the
| advantages and disadvantages of using Rcpp vector classes (vs
| std::vector) for member variables? To be more concrete, I mean
| class Foo
| {
| private:
|     Rcpp::NumericVector bar;
| };
| vs
| class Foo
| {
| private:
|     std::vector<double> bar;
| };
| Are there garbage collection issues when these live inside an XPtr<Foo>?
No. An XPtr<Foo> will delete the object it points to, using Foo's destructor,
when it goes out of scope. I would argue against using external pointers
directly when you can use modules and get more type safety than with direct
external pointers. But these (using external pointers and using modules) only
make sense when you want to be able to hold a reference to your object at the
R level. Do you?
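The ownership rule described here can be illustrated without Rcpp. What follows is a minimal plain C++ sketch, not Rcpp's implementation: std::unique_ptr stands in for the owning handle, and the live-instance counter on Foo is a hypothetical addition made only so that the destructor call can be observed. With a real XPtr<Foo>, the trigger would be R garbage-collecting the external pointer rather than a C++ scope ending.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical Foo that counts live instances so destruction can be observed.
class Foo {
public:
    explicit Foo(std::size_t n) : bar(n, 0.0) { ++live_count(); }
    ~Foo() { --live_count(); }  // bar is released right after this body runs

    // Non-copyable: exactly one handle owns each Foo.
    Foo(const Foo&) = delete;
    Foo& operator=(const Foo&) = delete;

    std::size_t size() const { return bar.size(); }

    // Number of Foo objects currently alive.
    static int& live_count() {
        static int count = 0;
        return count;
    }

private:
    std::vector<double> bar;
};

// Creates a Foo behind an owning handle, lets the handle go out of scope,
// and returns how many Foo objects are still alive afterwards.
int live_after_scope_exit() {
    {
        std::unique_ptr<Foo> handle(new Foo(10));
        // While the handle exists, the Foo and its bar stay alive.
        if (Foo::live_count() != 1) return -1;
    }  // handle destroyed here: Foo::~Foo runs, then bar is freed
    return Foo::live_count();
}
```

Because bar is an ordinary member, nothing extra is needed to release it: destroying the Foo destroys the vector as well.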
| Are there speed advantages of std::vector<double> over Rcpp::NumericVector
| for general use? Any input would be welcome. Thanks in advance.
This is premature optimization. What you want to ask yourself is what
are you going to do with "bar". If bar goes back and forth between the
C++ and the R side, then NumericVector is your best candidate.
If bar is something internal to your class, then std::vector<> is fine
and will give you a more complete interface, will grow efficiently, etc ...
If you really want to have the best performing class for your
application, you need to measure it.
It is easy enough to make Foo a template and switch between the two in
benchmarking:
template <typename Container>
class Foo {
private:
    Container bar;
};
Foo< std::vector<double> > f1;
Foo< Rcpp::NumericVector > f2;
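A slightly fuller version of the template above, as a rough sketch for benchmarking. The names fill, sum and time_fill_and_sum are hypothetical, and std::deque stands in for the second container so that the block compiles without Rcpp. Swapping in Rcpp::NumericVector is assumed to work only as far as it provides the same member functions.

```cpp
#include <chrono>
#include <cstddef>
#include <deque>
#include <numeric>
#include <vector>

// Foo is parameterised on its container so the same member functions can be
// timed against different storage types.
template <typename Container>
class Foo {
public:
    // Append the values 0, 1, ..., n-1 one at a time, exercising growth.
    void fill(std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            bar.push_back(static_cast<double>(i));
        }
    }

    // Sum all stored values, exercising sequential reads.
    double sum() const {
        return std::accumulate(bar.begin(), bar.end(), 0.0);
    }

    std::size_t size() const { return bar.size(); }

private:
    Container bar;
};

// Times one fill-and-sum pass for the given container type.
// Returns elapsed seconds and stores the computed sum in result.
template <typename Container>
double time_fill_and_sum(std::size_t n, double& result) {
    const auto start = std::chrono::steady_clock::now();
    Foo<Container> foo;
    foo.fill(n);
    result = foo.sum();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling time_fill_and_sum< std::vector<double> >(n, r) and time_fill_and_sum< std::deque<double> >(n, r) for a realistic n, repeated several times, gives numbers for the workload at hand, which is the measurement being recommended.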
| Great work on Rcpp, by the way. I've been hearing very good things for quite
| some time, but wasn't sure if it was worth dusting off my slightly rusty C++
| for. Suffice to say I think it was. The API is very clean and returning to the
| standard R API will be painful...!
Great. You don't need expert knowledge of C++ for Rcpp to be useful.
Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30
Thanks Dirk and Romain for your helpful replies. To follow up briefly...
| | I'm defining a custom class, an object of which will need to survive
| | across various calls back and forth between R and C++, so I plan to use
| | the XPtr class to wrap a pointer. My question is, what are the
| | advantages and disadvantages of using Rcpp vector classes (vs
| | std::vector) for member variables? To be more concrete, I mean
| | class Foo
| | {
| | private:
| |     Rcpp::NumericVector bar;
| | };
| | vs
| | class Foo
| | {
| | private:
| |     std::vector<double> bar;
| | };
| | Are there garbage collection issues when these live inside an XPtr<Foo>?
| No. An XPtr<Foo> will delete the object it points to, using Foo's destructor
| when it goes out of scope.
Sure. And the memory allocated to "bar" (if it's a NumericVector) will be protected from the garbage collector until the Foo object is deleted?
| I would argue against using external pointers directly when you can use
| modules and experience more type safety than with direct external pointers.
| But these (using external pointers and using modules) only make sense when
| you want to be able to hold a reference to your object at the R level, do
| you?
I think I do... ;) I need the object to hold state and not be deallocated between calls into the C++ code. I also want to allow for the possibility that multiple Foo objects exist, and are being operated on simultaneously. So holding a handle on the R side and passing it back to each C++ function that works with the object seems like the natural approach to me.
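A minimal sketch of that handle-passing pattern in plain C++. The foo_* entry points and the append/total members are hypothetical names. With Rcpp, each function would presumably receive an XPtr<Foo> in place of the raw pointer, and the explicit foo_destroy would be replaced by the finalizer.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical stateful class; each instance keeps its own data between calls.
class Foo {
public:
    void append(double x) { bar.push_back(x); }
    double total() const { return std::accumulate(bar.begin(), bar.end(), 0.0); }
    std::size_t size() const { return bar.size(); }

private:
    std::vector<double> bar;
};

// Hypothetical entry points, one per call from the host language.
// The caller keeps the handle and passes it back on every call.
Foo* foo_create() { return new Foo(); }
void foo_append(Foo* handle, double x) { handle->append(x); }
double foo_total(const Foo* handle) { return handle->total(); }
void foo_destroy(Foo* handle) { delete handle; }
```

Two handles obtained from foo_create refer to independent objects, so several Foo instances can be worked on side by side, each keeping its state until it is destroyed.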
| | Are there speed advantages of std::vector<double> over
| | Rcpp::NumericVector for general use? Any input would be welcome. Thanks in
| | advance.
| This is premature optimization. What you want to ask yourself is what are
| you going to do with "bar". If bar goes back and forth between the C++ and
| the R side, then NumericVector is your best candidate.
Agreed, but it's a consideration. I fully accept that the choice depends on the particular application, as you and Dirk both said. I was just wondering what the baseline performance difference (if any) might be.
| If bar is something internal to your class, then std::vector<> is fine and
| will give you a more complete interface, will grow efficiently, etc ...
| If you really want to have the best performing class for your application,
| you need to measure it.
| It is easy enough to make Foo a template and switch between the two in
| benchmarking:
| template <typename Container>
| class Foo {
| private:
|     Container bar;
| };
| Foo< std::vector<double> > f1;
| Foo< Rcpp::NumericVector > f2;
| | Great work on Rcpp, by the way. I've been hearing very good things for
| | quite some time, but wasn't sure if it was worth dusting off my slightly
| | rusty C++ for. Suffice to say I think it was. The API is very clean and
| | returning to the standard R API will be painful...!
| Great. You don't need expert knowledge of C++ for Rcpp to be useful.
Sure, but I'm doing quite a bit in native code so there's little point in
dropping competent C code for badly-written C++. Yes, I could mix them, but
it's nice to be able to make the most of the tools available... :)
Regards,
Jon
On 08/10/13 14:53, Jon Clayden wrote:
| Thanks Dirk and Romain for your helpful replies. To follow up briefly...
| | | I'm defining a custom class, an object of which will need to survive
| | | across various calls back and forth between R and C++, so I plan to use
| | | the XPtr class to wrap a pointer. My question is, what are the
| | | advantages and disadvantages of using Rcpp vector classes (vs
| | | std::vector) for member variables? To be more concrete, I mean
| | | class Foo
| | | {
| | | private:
| | |     Rcpp::NumericVector bar;
| | | };
| | | vs
| | | class Foo
| | | {
| | | private:
| | |     std::vector<double> bar;
| | | };
| | | Are there garbage collection issues when these live inside an
| | | XPtr<Foo>?
| | No. An XPtr<Foo> will delete the object it points to, using Foo's
| | destructor when it goes out of scope.
| Sure. And the memory allocated to "bar" (if it's a NumericVector) will
| be protected from the garbage collector until the Foo object is deleted?
yes
| | I would argue against using external pointers directly when you can
| | use modules and experience more type safety than with direct
| | external pointers.
| | But these (using external pointers and using modules) only make
| | sense when you want to be able to hold a reference to your object at
| | the R level, do you?
| I think I do... ;) I need the object to hold state and not be
| deallocated between calls into the C++ code. I also want to allow for
| the possibility that multiple Foo objects exist, and are being operated
| on simultaneously. So holding a handle on the R side and passing it back
| to each C++ function that works with the object seems like the natural
| approach to me.
I'd strongly advise you to consider using modules as the vessel for that sort
of thing. This way, on the R side, you have something concrete instead of
something opaque. R does not know what is inside an external pointer, but with
a module object you have access to its fields, methods, etc.
| | | Are there speed advantages of std::vector<double> over
| | | Rcpp::NumericVector for general use? Any input would be welcome.
| | | Thanks in advance.
| | This is premature optimization. What you want to ask yourself is
| | what are you going to do with "bar". If bar goes back and forth
| | between the C++ and the R side, then NumericVector is your best
| | candidate.
| Agreed, but it's a consideration. I fully accept that the choice depends
| on the particular application, as you and Dirk both said. I was just
| wondering what the baseline performance difference (if any) might be.
There is no such answer. It really depends on what you do with bar.
| | If bar is something internal to your class, then std::vector<> is
| | fine and will give you a more complete interface, will grow
| | efficiently, etc ...
| | If you really want to have the best performing class for your
| | application, you need to measure it.
| | It is easy enough to make Foo a template and switch between the two
| | in benchmarking:
| | template <typename Container>
| | class Foo {
| | private:
| |     Container bar;
| | };
| | Foo< std::vector<double> > f1;
| | Foo< Rcpp::NumericVector > f2;
| | | Great work on Rcpp, by the way. I've been hearing very good things
| | | for quite some time, but wasn't sure if it was worth dusting off my
| | | slightly rusty C++ for. Suffice to say I think it was. The API is
| | | very clean and returning to the standard R API will be painful...!
| | Great. You don't need expert knowledge of C++ for Rcpp to be useful.
| Sure, but I'm doing quite a bit in native code so there's little point
| in dropping competent C code for badly-written C++. Yes, I could mix
| them, but it's nice to be able to make the most of the tools available... :)
Sure. Refactoring existing C code into C++ is kind of hard, but writing new C++ code instead of new C code is easier. At least it is from my perspective.
Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30
On 8 October 2013 at 13:53, Jon Clayden wrote:
| Sure. And the memory allocated to "bar" (if it's a NumericVector) will be
| protected from the garbage collector until the Foo object is deleted?
Yes, R objects created by Rcpp, e.g. Rcpp::NumericVector and all the other
Rcpp::* objects mapping to standard R types, are indistinguishable from native
R objects and behave the same at the R level as objects created by R. That is
essentially the whole point.
| I think I do... ;) I need the object to hold state and not be deallocated
| between calls into the C++ code. I also want to allow for the possibility that
| multiple Foo objects exist, and are being operated on simultaneously. So
| holding a handle on the R side and passing it back to each C++ function that
| works with the object seems like the natural approach to me.
A really simple way to do that is to create a container class that has this
type as an object, and to create an init function, an accessor function, ...
Rcpp modules can do that for you too just via declarations, resulting in
Reference Class objects at the R level. This is a little more advanced; maybe
more suitable for your next project, or now if you are willing to read up
(Rcpp modules vignette, corresponding chapter in the Rcpp book, existing
packages, ...)
| Agreed, but it's a consideration. I fully accept that the choice depends on the
| particular application, as you and Dirk both said. I was just wondering what
| the baseline performance difference (if any) might be.
Nobody knows ex ante. There is no explicit slowdown baked in. As we have
suggested several times, you need to measure it!
| Sure, but I'm doing quite a bit in native code so there's little point in
| dropping competent C code for badly-written C++. Yes, I could mix them, but
| it's nice to be able to make the most of the tools available... :)
You can do whatever you want and how you want to do it. You can write as much
K&R C code as you like.
Rcpp is here to get your data more seamlessly from R to C++ and back again,
and a lot else. If you then prefer to cast back to C, go for it. Not my style,
but heck, choice is good. It even lets people do silly things like work in
C ;-)
Dirk
Dirk Eddelbuettel | edd at debian.org | http://dirk.eddelbuettel.com
On 8 October 2013 14:19, Dirk Eddelbuettel <edd at debian.org> wrote:
| On 8 October 2013 at 13:53, Jon Clayden wrote:
| | I think I do... ;) I need the object to hold state and not be deallocated
| | between calls into the C++ code. I also want to allow for the possibility
| | that multiple Foo objects exist, and are being operated on simultaneously.
| | So holding a handle on the R side and passing it back to each C++ function
| | that works with the object seems like the natural approach to me.
| A really simple way to do that is to create a container class that has this
| type as an object, and to create an init function, an accessor function, ...
| Rcpp modules can do that for you too just via declarations, resulting in
| Reference Class objects at the R level. This is a little more advanced; maybe
| more suitable for your next project, or now if you are willing to read up
| (Rcpp modules vignette, corresponding chapter in the Rcpp book, existing
| packages, ...)
Right, thanks. I will read the modules vignette for a start.
| | Agreed, but it's a consideration. I fully accept that the choice depends on
| | the particular application, as you and Dirk both said. I was just wondering
| | what the baseline performance difference (if any) might be.
| Nobody knows ex ante. There is no explicit slowdown baked in. As we have
| suggested several times, you need to measure it!
Yes, OK - poor phrasing on my part. I take the point.
Jon