efficiency when processing ordered data frames

Wed, May 20, 2009 5:54 AM

Hoping for a little insight into how to make sure I have R running as
efficiently as possible.

Suppose I have a data frame, A, with n rows and m columns, where col1
is a date-time stamp.  Also suppose that when this data is imported
(from a CSV file or a SQL query), it is already sorted so that the
time stamps in col1 are in ascending (or descending) order.

If I then wanted to select only the rows of A where col1 <= a certain
time, I am wondering whether R has to read through the entirety of col1
(all n entries) to select those rows.  Is it possible for R to recognize
(or somehow be told) that these rows are already in order, so that
the selection could be completed in ~log(n) row reads instead?
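
To make the idea concrete, here is a minimal sketch, assuming col1 is
stored as POSIXct and sorted ascending.  Base R's findInterval() locates
a cutoff in a sorted vector by binary search; note that by default it
also checks that the vector is sorted, which is itself a pass over the
data, so this may not achieve a strict ~log(n) overall.  The toy data
and the names cutoff, k, subset_sorted and subset_scan are made up for
illustration only.

```r
# Illustrative data (an assumption, not real data): 10 rows,
# time stamps one hour apart, already in ascending order.
A <- data.frame(
  col1 = as.POSIXct("2009-05-20 00:00:00", tz = "UTC") + 3600 * (0:9),
  col2 = 1:10
)
cutoff <- as.POSIXct("2009-05-20 04:30:00", tz = "UTC")

# findInterval() returns the number of elements of the sorted vector
# that are <= cutoff, found by binary search.
k <- findInterval(as.numeric(cutoff), as.numeric(A$col1))

# Because col1 is ascending, rows 1..k are exactly those with col1 <= cutoff.
subset_sorted <- A[seq_len(k), ]

# The usual logical comparison, which scans all n entries, for comparison.
subset_scan <- A[A$col1 <= cutoff, ]
```

Both subsets contain the same rows here; the difference is only in how
the cutoff position is found.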

Thanks!

jim holtman

Wed, May 20, 2009 6:27 AM

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20090520/ddaef003/attachment-0001.pl>