cumsum on chron objects

5 messages · Gabor Grothendieck, Sébastien Bihorel

#
Hi,

Is there some alternative to cumsum for chron objects? I have data frames
that contain some chron objects that look like this:

DateTime
13/10/03 12:30:35
NA
NA
NA
15/10/03 16:30:05
NA
NA
...


and I've been trying to replace the NA's so that a date/time sequence is
created starting with the preceding available value. Because the number of
rows with NA's following each available date/time is unknown, I've split
the data frame using:

splitdf <- split(df, as.factor(df$DateTime))

so that I can later use lapply to work on each "block" of data. I thought
I could use cumsum and set the NA's to the desired interval to create the
date/time sequence starting with the first row. However, this function is
not defined for chron objects. Does anybody know of alternatives to create
such a sequence?

Thanks in advance,
#
On 5/17/05, Sebastian Luque <sluque at mun.ca> wrote:
The 'zoo' package has na.locf which stands for Last Occurrence Carried
Forward, which is what I believe you want.   

First let us create some test data, x:

[1] (01/02/70 12:00:00) (01/03/70 00:00:00) (NA NA)
[4] (NA NA)             (01/05/70 00:00:00) (NA NA)

Applying na.locf to it gives x.locf:

[1] (01/02/70 12:00:00) (01/03/70 00:00:00) (01/03/70 00:00:00)
[4] (01/03/70 00:00:00) (01/05/70 00:00:00) (01/05/70 00:00:00)
#
On 5/17/05, Gabor Grothendieck <ggrothendieck at gmail.com> wrote:
Just to reply to my own post, it can actually be done even more
simply:

chron(na.locf(as.vector(x)))

Also in re-reading my post, I think the O in locf stands for observation 
rather than occurrence.
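
For anyone who wants to run the one-liner above, here is a minimal sketch. The construction of x is an assumption: the values (days since chron's default origin, 01/01/70) were chosen only so that they match the printed test data earlier in the thread.

```r
library(chron)  # chron()
library(zoo)    # na.locf()

# Assumed test data: numeric days since 01/01/70, with NA gaps
x <- chron(c(1.5, 2, NA, NA, 4, NA))

# Strip the chron class, carry the last observation forward,
# then turn the filled numbers back into a chron object
x.locf <- chron(na.locf(as.vector(x)))

# Underlying numbers of x.locf: 1.5 2 2 2 4 4
as.vector(x.locf)
```

Note that na.locf drops leading NAs by default, so this sketch assumes the first element is observed.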
#
Hello Gabor,

Thanks for your reply. na.locf would replace the NA's with the most recent
non-NA, so it wouldn't create a sequence of chron dates/times (via
as.vector, as in your example). To expand my original example:
[...]
I thought one could replace the NA's by the desired interval, say 1 day,
so if the above chron object was named nachron, one could do:

nachron[is.na(nachron)] <- 1

and, for simplicity, applying on each "block" separately:

cumsum(nachron)

would give:

DateTime
13/10/03 12:30:35
14/10/03 12:30:35
15/10/03 12:30:35
16/10/03 12:30:35

for the first "block", and:

DateTime
15/10/03 16:30:05
16/10/03 16:30:05
17/10/03 16:30:05
...

for the second one. Since there are not too many blocks I may end up doing
it in Excel, but it would be nice to know how to do it in R!

Cheers and thank you,
#
On 5/17/05, Sebastian Luque <sluque at mun.ca> wrote:
I did not understand that you wanted a sequence.

If x and x.locf are as in the previous response then:

   my.seq <- function(x) seq(from = x[1], len = length(x))
   chron(unlist(tapply(x, x.locf, my.seq)))

or if you want to use cumsum:

   xx <- as.vector(x); xx[is.na(xx)] <- 1
   chron(unlist(tapply(xx, x.locf, cumsum)))
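
A possible variant of the cumsum idea, sketched here under some assumptions: the helper name fill_seq and its step argument are made up for illustration, the test vector is assumed (chosen to match the printed data earlier in the thread), and the first element is assumed not to be NA. Instead of grouping by the filled-in dates, it numbers the blocks with cumsum(!is.na(.)) and uses ave(), which keeps the rows in their original order even if the observed dates are not ascending or repeat.

```r
library(chron)

# Assumed test data: numeric days since 01/01/70, with NA gaps
x <- chron(c(1.5, 2, NA, NA, 4, NA))

# Hypothetical helper: replace each NA by the previous value plus `step` days
fill_seq <- function(x, step = 1) {
  v <- as.vector(x)                   # plain numbers (days since origin)
  block <- cumsum(!is.na(v))          # block id goes up at every observed value
  v[is.na(v)] <- step                 # NAs become the increment
  chron(ave(v, block, FUN = cumsum))  # running sum within each block
}

filled <- fill_seq(x)

# Underlying numbers of filled: 1.5 2 3 4 4 5
as.vector(filled)
```

Passing a different step, for example step = 1/24, would give hourly rather than daily increments, since chron stores times as fractions of a day.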