
[Rcpp-devel] Efficient DataFrame access by row & column

Thanks - that does seem to work, but it doesn't perform as well as the pre-copying version.

Here's a Gist so the conversation can be more concrete:

  https://gist.github.com/kenahoo/4991485

For me, the countSteps() version is about 10 times faster than countSteps2().

 -Ken

From: John Merrill [mailto:john.merrill at gmail.com]
Sent: Tuesday, February 19, 2013 5:46 PM
To: Ken Williams
Cc: Yan Zhou; Dirk Eddelbuettel; rcpp-devel at lists.r-forge.r-project.org
Subject: Re: [Rcpp-devel] Efficient DataFrame access by row & column

Well, here's a snippet from a much larger routine I used deep inside an implementation of kd-trees:

  for (int i = 0; i < instances_df.size(); ++i) {
    // Fetch column i once, outside the inner loop over rows.
    const NumericVector& data_column = instances_df[i];
    for (int j = 0; j < training_instances.size(); ++j) {
      // Argument order changes here: the source is read column by column,
      // but the destination is indexed [row][column].
      instances[j][i] = data_column[training_instances[j]];
    }
  }

To set expectations, training_instances can be very large indeed (ca. 1M). The code is quite fast.
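
For readers without an R session at hand, the same access pattern can be sketched in plain C++. This is only an illustration, not the kd-tree routine itself: the `Columns` alias and the `gather_rows` function are hypothetical stand-ins (a vector of equal-length numeric columns in place of a DataFrame), and no Rcpp types are used. The point it shows is that each column is looked up once in the outer loop, and the inner loop does nothing but index into it.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a data frame: a list of equal-length numeric columns.
using Columns = std::vector<std::vector<double>>;

// Copy the selected rows into a row-major table, reading one column at a time.
// row_ids plays the role of training_instances in the snippet above.
std::vector<std::vector<double>> gather_rows(const Columns& columns,
                                             const std::vector<std::size_t>& row_ids) {
    std::vector<std::vector<double>> rows(row_ids.size(),
                                          std::vector<double>(columns.size()));
    for (std::size_t i = 0; i < columns.size(); ++i) {
        // Bind the column once per outer iteration; no per-element lookup.
        const std::vector<double>& column = columns[i];
        for (std::size_t j = 0; j < row_ids.size(); ++j) {
            // Destination is indexed [row][column], source is [column][row].
            rows[j][i] = column[row_ids[j]];
        }
    }
    return rows;
}
```

Calling `gather_rows` with two columns and the row ids {2, 0} would return two rows, each holding the values of both columns for the requested row, in the order the ids were given.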

(And sorry, Dirk -- yes, I really do have an access of the form x[i][j].  Mea culpa, etc.)

On Tue, Feb 19, 2013 at 3:26 PM, Ken Williams <Ken.Williams at windlogics.com> wrote:

            
I would love to use a reference, but I don't know how.  That's in fact the essence of my question. =)

Is there already some example code somewhere showing how to get a reference to a DataFrame column without copying?  I must be just missing it.

 -Ken


