[Rcpp-devel] Efficient DataFrame access by row & column

22 messages · John Merrill, Kevin Ushey, Dirk Eddelbuettel +3 more

#
Hi,

I have a need to loop through all the entries of a DataFrame by row, then column.  I know two different ways:

  // Case A: When df.length() is unknown at coding time:
  int n = df.nrows();
  int m = df.length();
  for(int i=1; i<n; i++) {
    for(int j=0; j<m; j++) {
      NumericVector v = df[j];
      // ... do stuff with v[i] ...
    }
  }

  // Case B: If I know the number of columns while writing the C code:
  int n = df.nrows();
  NumericVector xs = df[0];
  NumericVector ys = df[1];
  for(int i=1; i<n; i++) {
    // ... do stuff with xs[i] and ys[i] ...
  }

The second way is less flexible, but it's also quite a bit faster in practice - I presume this means the "NumericVector ..." expressions are doing a non-trivial amount of work (perhaps even copying the whole vector?).

Is there a way to have my cake & eat it?  Can I efficiently (O(1)) index into a DataFrame by numeric row index and numeric column index?

I'm also curious why it's a syntax error in Case A to just write `df[j][i]` or even `((NumericVector) df[j])[i]`  - clearly there's magic behind the "NumericVector" call that I don't understand.

Thanks.

--
Ken Williams, Senior Research Scientist
WindLogics
http://windlogics.com


________________________________

CONFIDENTIALITY NOTICE: This e-mail message is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution of any kind is strictly prohibited. If you are not the intended recipient, please contact the sender via reply e-mail and destroy all copies of the original message. Thank you.
#
Ken,
On 19 February 2013 at 22:35, Ken Williams wrote:
| I have a need to loop through all the entries of a DataFrame by row, then
| column.  I know two different ways:

There have been prior discussions of this topic, as well as example posts --
even leading to a Rcpp Gallery article. Did you read any of these?  It wasn't
clear from your post.

| I'm also curious why it's a syntax error in Case A to just write `df[j][i]` or

Eeeek.  I prefer the more C++-y way of writing df(j,i).  Square brackets only
work for vectors, and even then you may be better off with x(i) for
consistency.

Overall, your premise may be wrong too.  "We all know" that a data.frame is
not the fastest data structure in R, so by forcing ourselves to the same
access, are we not handicapping ourselves?

Once you are in C++, you can use whatever C++ datatype you like.  A
data.frame really is just a list of vectors, and each of the vectors has e.g. a
begin() iterator from which you can (fairly costlessly) instantiate STL types.

And those give you performance guarantees.

Hope this helps,  Dirk
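
A minimal sketch of the iterator idea above, in plain C++ with no Rcpp dependency. It assumes only that a column exposes a begin/end iterator pair; a raw array stands in for the column, and the helper names column_sum and to_std_vector are hypothetical, not part of Rcpp.

```cpp
#include <numeric>
#include <vector>

// Read a column in place through its iterator range; nothing is copied.
template <typename Iter>
double column_sum(Iter first, Iter last) {
    return std::accumulate(first, last, 0.0);
}

// Copy a column into a std::vector once, to get a plain STL container.
template <typename Iter>
std::vector<double> to_std_vector(Iter first, Iter last) {
    return std::vector<double>(first, last);
}
```

With `double col[] = {1.0, 2.0, 3.5};`, the call `column_sum(col, col + 3)` works on the existing storage, while `to_std_vector(col, col + 3)` makes exactly one copy.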
#
The most inefficient part I see is the creation of a new NumericVector inside the innermost loop. You copy each column n times, n-1 of which are unnecessary.

Yan Zhou
#
That's what I suspected.  So is there no way to access elements of a DataFrame without copying columns to a new vector first?

 -Ken

#
I'm a little puzzled by your question.  Could you use a reference instead
of instantiating a new copy?


#
I looked, but I didn't find anything directly addressing it.  Most of what I found at http://search.gmane.org/?query=dataframe&group=gmane.comp.lang.r.rcpp seems to deal with creating DataFrame objects, not indexing into them.

In the Rcpp Gallery, I also see 2 articles on creating/modifying DataFrame objects, but nothing demonstrating any indexing differently than I wrote.

The other place I looked was inst/unitTests/cpp/DataFrame.cpp in the repository.

If I missed something relevant, I'd be happy to be pointed to it.

Attempting to write df(j,i) as you suggest, I get a compile-time error:

window.cpp:68:34: error: ambiguous overload for 'operator-' in 'Rcpp::Vector<RTYPE>::operator()(const size_t&, const size_t&) [with int RTYPE = 19, Rcpp::Vector<RTYPE>::Proxy = Rcpp::internal::generic_proxy<19>, size_t = long long unsigned int]((* &((size_t)j)), (* &((size_t)i))) - Rcpp::Vector<RTYPE>::operator()(const size_t&, const size_t&) [with int RTYPE = 19, Rcpp::Vector<RTYPE>::Proxy = Rcpp::internal::generic_proxy<19>, size_t = long long unsigned int]((* &((size_t)j)), (* &((size_t)last_i)))'
window.cpp:68:34: note: candidates are:
window.cpp:68:34: note: operator-(SEXP, SEXP) <built-in>
window.cpp:68:34: note: operator-(SEXP, long long int) <built-in>
window.cpp:68:34: note: operator-(int, int) <built-in>

For context, the line that's failing is:

     if(fabs(df(j,i)-df(j,last_i))>thresh) {

I was operating under the premise that there "must be" a constant-time accessor for a List element (DataFrame column), and once I have that, a constant-time accessor for an element of that vector.  I know the latter is true, but is the former not true?  I assumed it was but that I just couldn't find it.

 -Ken

#
I would love to use a reference, but I don't know how.  That's in fact the essence of my question. =)

Is there already some example code somewhere showing how to get reference to a DataFrame column without copying?  I must be just missing it.

 -Ken


#
Ken,
On 19 February 2013 at 23:24, Ken Williams wrote:
| > From: Dirk Eddelbuettel [mailto:edd at debian.org]
| > Eeeek.  I prefer the more C++-y way of writing df(j,i).
| 
| Attempting to do so, I get a compile-time error:
| 
| window.cpp:68:34: error: ambiguous overload for 'operator-' in 'Rcpp::Vector<RTYPE>::operator()(const size_t&, const size_t&) [with int RTYPE = 19, Rcpp::Vector<RTYPE>::Proxy = Rcpp::internal::generic_proxy<19>, size_t = long long unsigned int]((* &((size_t)j)), (* &((size_t)i))) - Rcpp::Vector<RTYPE>::operator()(const size_t&, const size_t&) [with int RTYPE = 19, Rcpp::Vector<RTYPE>::Proxy = Rcpp::internal::generic_proxy<19>, size_t = long long unsigned int]((* &((size_t)j)), (* &((size_t)last_i)))'
| window.cpp:68:34: note: candidates are:
| window.cpp:68:34: note: operator-(SEXP, SEXP) <built-in>
| window.cpp:68:34: note: operator-(SEXP, long long int) <built-in>
| window.cpp:68:34: note: operator-(int, int) <built-in>
| 
| For context, the line that's failing is:
| 
|      if(fabs(df(j,i)-df(j,last_i))>thresh) {

You are relying on an implicit conversion here to make df(.,.) a double. The
compiler tells you that it is trying sugar operators.  Sugar is good to have
in Rcpp, but occasionally there is a cost. This may be one of those times.

Try

        double a = df(j,i);
        double b = df(j,last_i);
        if (fabs(a - b) > thresh) {

and see if that works.  

You may want to look into using Armadillo vectors and matrices (or Eigen's,
we are equal opportunity here).  I had good luck with Armadillo.

| I was operating under the premise that there "must be" a constant-time accessor for a List element (DataFrame column), and once I have that, a constant-time accessor for an element of that vector.  I know the latter is true, but is the former not true?  I assumed it was but that I just couldn't find it.

Sadly, "must be" is a little too optimistic.  

Not every conceivable operation is implemented in Rcpp, but we are always
open to patches to make it more complete.

Hope this helps,  Dirk
#
Unfortunately no, I get a similar error when I try that:

window.cpp:68:24: error: conversion from 'Rcpp::Vector<19>::Proxy {aka Rcpp::internal::generic_proxy<19>}' to 'double' is ambiguous
window.cpp:68:24: note: candidates are:
C:/mybin/R/R-2.15.2/library/Rcpp/include/Rcpp/vector/string_proxy.h:273:4: note: Rcpp::internal::generic_proxy<RTYPE>::operator int() const [with int RTYPE = 19]
C:/mybin/R/R-2.15.2/library/Rcpp/include/Rcpp/vector/string_proxy.h:272:4: note: Rcpp::internal::generic_proxy<RTYPE>::operator bool() const [with int RTYPE = 19]
C:/mybin/R/R-2.15.2/library/Rcpp/include/Rcpp/vector/string_proxy.h:267:26: note: Rcpp::internal::generic_proxy<RTYPE>::operator U() const [with U = double, int RTYPE = 19]

> Not every conceivable operation is implemented in Rcpp,

Of course. =)

> but we are always open to patches to make it more complete.

If I can get myself up to speed with my understanding, I'll be happy to contribute what I can.

 -Ken

#
Well, here's a snippet from a much larger routine I used deep inside an
implementation of kd-trees:

  for (int i = 0; i < instances_df.size(); ++i) {
    const NumericVector& data_column = instances_df[i];
    for (int j = 0; j < training_instances.size(); ++j) {
      // Argument order changes here...
      instances[j][i] = data_column[training_instances[j]];
    }
  }

To set expectations, training_instances can be very large indeed (ca. 1M).
  The code is quite fast.

(And sorry, Dirk -- yes, I really do have an access of the form x[i][j].
 Mea culpa, etc.)


#
Another thing worth thinking about: perhaps the easiest way to side-step
the issue is to work with a NumericMatrix rather than a DataFrame. At
least, from the example you gave, it sounds like a container where you have
the expectation that each column is a NumericVector of equal length.

If you can make the switch to NumericMatrix, then you can generate and
operate with row/column views, e.g. NumericMatrix::Row and
NumericMatrix::Column, which will generate references to rows / columns and
hence avoid copying. (These are generated whenever you do e.g. x(i, _) or
x(_, i) on a NumericMatrix x).

-Kevin
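
To make the "views, not copies" idea concrete, here is a plain C++ sketch of row and column views over column-major storage, which is how R lays out matrices. It is an illustration only: Matrix, View, column and row are hypothetical stand-ins and do not reproduce how NumericMatrix::Row or NumericMatrix::Column are implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for a numeric matrix stored column-major:
// element (i, j) lives at data[i + j * nrow].
struct Matrix {
    std::size_t nrow, ncol;
    std::vector<double> data;

    Matrix(std::size_t r, std::size_t c) : nrow(r), ncol(c), data(r * c, 0.0) {}

    double& operator()(std::size_t i, std::size_t j) {
        assert(i < nrow && j < ncol);
        return data[i + j * nrow];
    }
};

// A view holds a pointer, a stride and a length, never a copy of the elements.
struct View {
    double* start;
    std::size_t stride;
    std::size_t length;

    double& operator[](std::size_t k) const {
        assert(k < length);
        return start[k * stride];
    }
};

// Column j is contiguous in memory: stride 1.
View column(Matrix& m, std::size_t j) {
    assert(j < m.ncol);
    return View{m.data.data() + j * m.nrow, 1, m.nrow};
}

// Row i touches every nrow-th element: stride nrow.
View row(Matrix& m, std::size_t i) {
    assert(i < m.nrow);
    return View{m.data.data() + i, m.nrow, m.ncol};
}
```

Writing through a view changes the underlying matrix, which is what distinguishes it from a copied row or column. Note that the matrix itself still has to be built from the data frame's columns first.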

#
Thanks - that does seem to work, but it doesn't perform as well as the pre-copying version.

Here's a Gist so the conversation can be more concrete:

  https://gist.github.com/kenahoo/4991485

For me, the countSteps() version is about 10 times faster than countSteps2().

 -Ken

#
That may be a good option if I'm willing to make at least one copy, right at the beginning, to create the NumericMatrix.  I'd prefer to even avoid that copy if possible, but if it must be, it must be.

It performs somewhere between the other two options:

  https://gist.github.com/kenahoo/4991485

-Ken


#
I made one more alternative, countSteps4(), and it seems to do pretty well:

  https://gist.github.com/kenahoo/4991485

Curious that it performs much better than the NumericMatrix version.  Does anyone have insight why?

-Ken

#
On 19 February 2013 at 23:46, Ken Williams wrote:
| > From: Dirk Eddelbuettel [mailto:edd at debian.org]
| > You are relying on an implicit conversion here to make df(.,.) a double. The
| > compiler tells you that it is trying sugar operators.  Sugar is good to have in
| > Rcpp, but occasionally there is a cost. This may be one of those times.
| >
| > Try
| >
| >       double a = df(j,i);
| >       double b = df(j,last_i);
| >         if (fabs(a - b) > thresh) {
| >
| > and see if that works.
| 
| Unfortunately no, I get a similar error when I try that:
| 
| window.cpp:68:24: error: conversion from 'Rcpp::Vector<19>::Proxy {aka Rcpp::internal::generic_proxy<19>}' to 'double' is ambiguous
| window.cpp:68:24: note: candidates are:
| C:/mybin/R/R-2.15.2/library/Rcpp/include/Rcpp/vector/string_proxy.h:273:4: note: Rcpp::internal::generic_proxy<RTYPE>::operator int() const [with int RTYPE = 19]
| C:/mybin/R/R-2.15.2/library/Rcpp/include/Rcpp/vector/string_proxy.h:272:4: note: Rcpp::internal::generic_proxy<RTYPE>::operator bool() const [with int RTYPE = 19]
| C:/mybin/R/R-2.15.2/library/Rcpp/include/Rcpp/vector/string_proxy.h:267:26: note: Rcpp::internal::generic_proxy<RTYPE>::operator U() const [with U = double, int RTYPE = 19]

Maybe try std::fabs() instead of fabs() ?

That is one of the reasons I do not like a global 'using namespace Rcpp' in
my code.  Explicit namespaces are, well, explicit.

Dirk
#
Hi Dirk,

It is actually a problem with Rcpp, though no longer related to Ken's original question. Below is an example that reproduces the problem.

#include <iostream>

class Wrapper
{
    public :

    Wrapper (double x) : x_(x) {}

    template <typename T> operator T () const {return x_;}
    operator int () const {return x_;}
    operator bool () const {return x_;}

    private :

    double x_;
};

int main ()
{
    Wrapper x(1.2);
    double a = x; // ERROR

    std::cout << a << std::endl;
}

The error is quite compiler-dependent. Clang and G++ consider three operators (the template one with T = double, operator int, and operator bool) and conclude that the conversion is ambiguous. Intel icpc does not consider the template one and also considers it ambiguous, for the obvious reason. Why the template one is not a perfect match is still a mystery to me right now.
#
On 20 February 2013 at 01:15, Yan Zhou wrote:
| Hi Dirk,
| 
| It is actually a problem with Rcpp, though no longer related to Ken's original question. Below is an example that reproduces the problem.
| 
| #include <iostream>
| 
| class Wrapper
| {
|     public :
| 
|     Wrapper (double x) : x_(x) {}
| 
|     template <typename T> operator T () const {return x_;}
|     operator int () const {return x_;}
|     operator bool () const {return x_;}
| 
|     private :
| 
|     double x_;
| };
| 
| int main ()
| {
|     Wrapper x(1.2);
|     double a = x; // ERROR
| 
|     std::cout << a << std::endl;
| }
| 
| The error is quite compiler-dependent. Clang and G++ consider three operators (the template one with T = double, operator int, and operator bool) and conclude that the conversion is ambiguous. Intel icpc does not consider the template one and also considers it ambiguous, for the obvious reason. Why the template one is not a perfect match is still a mystery to me right now.

I see nothing Rcpp-specific here.  It's more about how to avoid ambiguity
within C++.  Might make a good question for the C++ tag on StackOverflow.

Dirk
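
One way to avoid this kind of ambiguity in plain C++ is to drop implicit conversion altogether and offer a named conversion member, so the caller states the target type. The sketch below is a hypothetical variant of the Wrapper class from the example above, not Rcpp's generic_proxy, and the as<T>() member and the exceeds() helper are assumptions made for illustration.

```cpp
#include <cmath>

// Hypothetical stand-in for a proxy object; not Rcpp's generic_proxy.
class Wrapper {
public:
    explicit Wrapper(double x) : x_(x) {}

    // Named member template: the caller spells out the target type, so no
    // overload resolution among conversion operators takes place.
    template <typename T>
    T as() const { return static_cast<T>(x_); }

private:
    double x_;
};

// Mirrors the comparison from the thread, using explicit conversions only.
bool exceeds(const Wrapper& a, const Wrapper& b, double thresh) {
    return std::fabs(a.as<double>() - b.as<double>()) > thresh;
}
```

The cost of this design is verbosity at every call site; the benefit is that `x.as<double>()` can only mean one thing regardless of which other conversions the class offers.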
#
On Feb 20, 2013, at 1:46 AM, Dirk Eddelbuettel <edd at debian.org> wrote:

            
Yes, exactly. The problem itself is not Rcpp-specific, but the operator implementation was copied directly from Rcpp's proxy header. I will ask about it on Stack Overflow later.
#
On 20 February 2013 at 01:48, Yan Zhou wrote:
|
| On Feb 20, 2013, at 1:46 AM, Dirk Eddelbuettel <edd at debian.org> wrote:
| 
| >
| > On 20 February 2013 at 01:15, Yan Zhou wrote:
| > | Hi Dirk,
| > 
| > I see nothing Rcpp-specific here.  It's more about how to avoid ambiguity
| > within C++.  Might make a good question for the C++ tag on StackOverflow.
| Yes, exactly. The problem itself is nothing Rcpp specific but the operator implementation was copied directly from Rcpp's proxy header. I will ask it on Stack later.

Great, drop a pointer here when it is up. I try to keep up with R or at least
Rcpp questions there and don't regularly poke into the C++ tag anymore.

Dirk
#
On 20/02/13 00:09, Yan Zhou wrote:
No. This does not copy data. It takes time to protect the object, etc., but
the data in the vector is not copied.

  
    
#
Excellent, then my strategy of pre-populating a NumericVector[] won't bloat memory.  Thanks for confirming.

 -Ken
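
The pre-populating strategy can be sketched in plain C++ as follows. This is an analogy under stated assumptions, not the code from the Gist: std::vector<std::vector<double>> stands in for the DataFrame, a const double* stands in for each column handle, and count_steps is a hypothetical task that counts jumps larger than thresh between consecutive rows.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Stand-in for a data frame: a list of equally long numeric columns.
using Columns = std::vector<std::vector<double>>;

// Hypothetical task: count how often any column jumps by more than `thresh`
// between consecutive rows, visiting the data row by row.
std::size_t count_steps(const Columns& df, double thresh) {
    if (df.empty()) return 0;
    const std::size_t n = df[0].size();  // rows
    const std::size_t m = df.size();     // columns

    // Fetch one handle per column up front, outside both loops.
    std::vector<const double*> cols(m);
    for (std::size_t j = 0; j < m; ++j) {
        assert(df[j].size() == n);       // all columns must have equal length
        cols[j] = df[j].data();
    }

    std::size_t steps = 0;
    for (std::size_t i = 1; i < n; ++i) {        // by row ...
        for (std::size_t j = 0; j < m; ++j) {    // ... then by column
            if (std::fabs(cols[j][i] - cols[j][i - 1]) > thresh) ++steps;
        }
    }
    return steps;
}
```

The handles cost one pointer per column, so memory use does not grow with the number of rows, and each element access inside the loops is a constant-time pointer offset.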



#
I have posted a question on Stack Overflow about the issue, along with some more details on the problem:

http://stackoverflow.com/questions/14987326/overloading-conversion-operator-template

Also, I saw in the Rcpp source a comment saying that operator int and operator bool are provided to help compilers. I am curious in what situations a single operator template is not enough, since the operator simply calls Rcpp::as.

I have some thoughts on this issue. At the heart of the problem is that the compilers do not consider the operator template a perfect match. The only reason I can think of is that, if the operator template is chosen, some conversion of its argument is required, so it is no better than operator int and operator bool. However, according to the C++ rules, for a conversion operator, template argument deduction uses the return type, and the input argument is *this, which is not converted.

Best,

Yan Zhou