
Discretising intra-day data -- how to get by with less memory?

Three functions in xts can help here:

(1) align.time:  (as Brian alluded to)
This simply shifts every time-stamp forward to the next multiple of n seconds.
e.g. align.time(x, n=300)  # 5 minutes

(2) endpoints:
Locates the last time-stamp (observation in the time series) in each period of "k" units of "on".
e.g. endpoints(x, on="minutes", k=5)  # 5 minutes

(3) merge.xts with a regular time index.
e.g. merge(x, xts(, timeBasedSeq('2009-11-01 08:30/2009-11-01 15:00')))
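
To see what each of the three calls returns, here is a minimal sketch on a tiny series. The three ticks are made up for illustration, the index is forced to UTC so the results do not depend on the local time zone, and seq() stands in for timeBasedSeq(); none of that comes from the functions themselves.

```r
library(xts)

# Hypothetical three-tick series (times and values chosen for illustration).
idx <- as.POSIXct(c("2009-11-27 08:48:18", "2009-11-27 08:51:03",
                    "2009-11-27 08:52:13"), tz = "UTC")
x <- xts(c(9, 7, 8), order.by = idx)

# (1) align.time: push each time-stamp up to the next 5-minute boundary.
#     The ticks land on 08:50:00, 08:55:00 and 08:55:00.
x5 <- align.time(x, n = 300)
index(x5)

# (2) endpoints: row numbers of the last observation in each 5-minute
#     period, always with a leading 0.
ep <- endpoints(x, on = "minutes", k = 5)
ep    # 0 1 3

# (3) merge with a zero-width xts that carries a regular 1-minute index.
#     Aligning to whole minutes first makes the two indexes line up.
x1   <- align.time(x, n = 60)
grid <- xts(, seq(start(x1), end(x1), by = "1 min"))
m    <- merge(x1, grid)
m     # 5 rows, NA at 08:50 and 08:51
```

Note that align.time can leave several observations on the same time-stamp (two at 08:55:00 here), which is why it is usually paired with endpoints before merging.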



A complete example:
The raw, irregularly spaced series:
                    [,1]
2009-11-27 08:48:18    9
2009-11-27 08:51:03    7
2009-11-27 08:52:13    8
2009-11-27 08:53:10   10
2009-11-27 08:55:25    6
2009-11-27 08:55:56    1
2009-11-27 08:56:02    4
2009-11-27 08:56:44    3
2009-11-27 08:59:24    2
2009-11-27 09:02:46    5

After align.time with n=60 (every time-stamp moved up to the next whole minute):
                    [,1]
2009-11-27 08:49:00    9
2009-11-27 08:52:00    7
2009-11-27 08:53:00    8
2009-11-27 08:54:00   10
2009-11-27 08:56:00    6
2009-11-27 08:56:00    1
2009-11-27 08:57:00    4
2009-11-27 08:57:00    3
2009-11-27 09:00:00    2
2009-11-27 09:03:00    5

Keeping only the last observation in each minute, via endpoints:
                    [,1]
2009-11-27 08:49:00    9
2009-11-27 08:52:00    7
2009-11-27 08:53:00    8
2009-11-27 08:54:00   10
2009-11-27 08:56:00    1
2009-11-27 08:57:00    3
2009-11-27 09:00:00    2
2009-11-27 09:03:00    5

Merged with a regular one-minute index (minutes without an observation become NA):
                    xa.endpoints.xa...minutes...
2009-11-27 08:49:00                            9
2009-11-27 08:50:00                           NA
2009-11-27 08:51:00                           NA
2009-11-27 08:52:00                            7
2009-11-27 08:53:00                            8
2009-11-27 08:54:00                           10
2009-11-27 08:55:00                           NA
2009-11-27 08:56:00                            1
2009-11-27 08:57:00                            3
2009-11-27 08:58:00                           NA
2009-11-27 08:59:00                           NA
2009-11-27 09:00:00                            2
2009-11-27 09:01:00                           NA
2009-11-27 09:02:00                           NA
2009-11-27 09:03:00                            5

With the NAs filled by carrying the last observation forward:
                    xa.endpoints.xa...minutes...
2009-11-27 08:49:00                            9
2009-11-27 08:50:00                            9
2009-11-27 08:51:00                            9
2009-11-27 08:52:00                            7
2009-11-27 08:53:00                            8
2009-11-27 08:54:00                           10
2009-11-27 08:55:00                           10
2009-11-27 08:56:00                            1
2009-11-27 08:57:00                            3
2009-11-27 08:58:00                            3
2009-11-27 08:59:00                            3
2009-11-27 09:00:00                            2
2009-11-27 09:01:00                            2
2009-11-27 09:02:00                            2
2009-11-27 09:03:00                            5
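
The commands behind the output above did not survive in this copy of the message. As a sketch, something along these lines should reproduce it. The name xa and the expression xa[endpoints(xa, "minutes")] are taken from the printed column header; everything else is an assumption: the UTC time zone, seq() in place of timeBasedSeq(), and na.locf() (from zoo, loaded with xts) for the final fill.

```r
library(xts)

# The ten observations from the first printout; UTC is an assumption.
times <- as.POSIXct(paste("2009-11-27",
                          c("08:48:18", "08:51:03", "08:52:13", "08:53:10",
                            "08:55:25", "08:55:56", "08:56:02", "08:56:44",
                            "08:59:24", "09:02:46")), tz = "UTC")
x <- xts(c(9, 7, 8, 10, 6, 1, 4, 3, 2, 5), order.by = times)

# Step 1: move every time-stamp up to the next whole minute.
xa <- align.time(x, n = 60)

# Step 2: a zero-width xts with one time-stamp per minute over the same span.
grid <- xts(, seq(start(xa), end(xa), by = "1 min"))

# Step 3: keep the last observation of each minute, then merge onto the
# regular grid; minutes without an observation come out as NA.
regular <- merge(xa[endpoints(xa, "minutes")], grid)

# Step 4: carry the last observation forward into the NA slots.
filled <- na.locf(regular)
filled
```

Only the aligned series, the integer vector from endpoints and the merged result are created along the way, so no per-minute list or loop over the raw ticks is needed.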


I didn't test against your solution(s), but this should be very fast
and use as little memory as possible.  endpoints, align.time and
merge.xts have all been heavily optimized for speed and memory.

HTH
Jeff
On Fri, Nov 27, 2009 at 7:00 AM, Brian G. Peterson <brian at braverock.com> wrote: