Message-ID: <85f3856f0905200554s457edfb7x374d7fb37e385a3f@mail.gmail.com>
Date: 2009-05-20T12:54:28Z
From: Brigid Mooney
Subject: efficiency when processing ordered data frames
Hoping for a little insight into how to make sure I have R running as
efficiently as possible.
Suppose I have a data frame, A, with n rows and m columns, where col1
is a date-time stamp. Also suppose that when this data is imported
(from a CSV file or a SQL query), it is already sorted so that the
time stamp in col1 is in ascending (or descending) order.
If I then wanted to select only the rows of A where col1 <= a certain
time, I am wondering if R has to read through the entirety of col1 to
select those rows (all n of them). Is it possible for R to recognize
(or somehow be told) that these rows are already in order, thus
allowing the computation to be completed in ~log(n) row reads
instead?
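To make the question concrete, here is a rough sketch of the kind of
lookup I have in mind. The helper name last_row_at_or_before and the
example data are made up for illustration, and the sketch assumes col1
is in ascending order (it does not check this). It locates the last
qualifying row with a binary search, so it inspects roughly log2(n)
time stamps; whether R can do something like this on its own is what I
am asking.

```r
# Hypothetical helper (not a base R function): returns the number of
# leading rows whose time stamp is <= cutoff.
# ASSUMPTION: `times` is sorted in ascending order; this is not verified.
last_row_at_or_before <- function(times, cutoff) {
  lo <- 0L              # times[1..lo] are known to be <= cutoff
  hi <- length(times)   # times[(hi+1)..n] are known to be > cutoff
  while (lo < hi) {
    mid <- (lo + hi + 1L) %/% 2L   # upper midpoint, always in (lo, hi]
    if (times[mid] <= cutoff) {
      lo <- mid                    # boundary is at mid or later
    } else {
      hi <- mid - 1L               # boundary is before mid
    }
  }
  lo
}

# Made-up example data: ten rows, one minute apart, ascending in col1.
A <- data.frame(
  col1 = as.POSIXct("2009-05-20 00:00:00", tz = "UTC") + 60 * (0:9),
  col2 = 1:10
)
cutoff <- as.POSIXct("2009-05-20 00:04:30", tz = "UTC")

k <- last_row_at_or_before(A$col1, cutoff)

# Same rows as A[A$col1 <= cutoff, ], which compares all n time stamps.
subset_A <- A[seq_len(k), , drop = FALSE]
```

Note that only finding the boundary row is logarithmic here; copying
the k selected rows into subset_A still takes time proportional to k.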
Thanks!