Skip to content

Fill in missing times in a timeseries with NA

2 messages · lglew, Gabor Grothendieck

#
Hi,

I have a irregularly spaced time series dataset, which reads in from a .csv.
I need to convert this to a regularly spaced time series by filling in
missing rows of data with NAs. 

So my data, called NtuMot, looks like this (I've removed some of the
additional rows for simplicity)....
ELEID    date_time                       height                  slope
1        2009-06-24 00:00:00          150                      4.0
1        2009-06-24 01:00:00          175                      4.0 
1        2009-06-24 02:00:00          180                      2.3
1        2009-06-24 03:00:00          200                      1.0
1        2009-06-24 06:00:00          201                      1.0
1        2009-06-24 07:00:00          202                      0.0
1        2009-06-24 08:00:00          202                      0.0
1        2009-06-24 09:00:00          202                      0.0
1        2009-06-24 10:00:00          202                      0.0


I need to end up with this: 
ELEID    date_time                       height                  slope
   
1        2009-06-24 00:00:00         150                      4.0 
1        2009-06-24 01:00:00         175                      4.0
1        2009-06-24 02:00:00         180                      2.3
1        2009-06-24 03:00:00         200                      1.0
1        2009-06-24 04:00:00          NA                       NA
1        2009-06-24 05:00:00          NA                       NA
1        2009-06-24 06:00:00          201                      1.0
1        2009-06-24 07:00:00          202                      0.0
1        2009-06-24 08:00:00          202                      0.0
1        2009-06-24 09:00:00          202                      0.0
1        2009-06-24 10:00:00          202                      0.0

Any ideas much appreciated!

Thanks,

Louise
#
On Wed, Oct 27, 2010 at 8:57 AM, lglew <l.glew at soton.ac.uk> wrote:
This will do it producing a new data frame:

grid.df <- data.frame(date_time = seq(DF[1, 2], DF[nrow(DF), 2], by = "hour"))
merge(DF, grid.df)

however, if you are dealing with irregular series you might find it
more convenient to use the zoo package:

library(zoo)
z <- zoo(DF[-2], DF[, 2])
g <- seq(start(z), end(z), by = "hour")
m <- merge(z, zoo(, g))
m

In the above we used this for data frame DF:

DF <- structure(list(ELEID = c(1, 1, 1, 1, 1, 1, 1, 1, 1),
date_time = structure(c(1245816000,
1245819600, 1245823200, 1245826800, 1245837600, 1245841200, 1245844800,
1245848400, 1245852000), class = c("POSIXt", "POSIXct"), tzone = ""),
    height = c(150, 175, 180, 200, 201, 202, 202, 202, 202),
    slope = c(4, 4, 2.3, 1, 1, 0, 0, 0, 0)), .Names = c("ELEID",
"date_time", "height", "slope"), row.names = c("2009-06-24 00:00:00",
"2009-06-24 01:00:00", "2009-06-24 02:00:00", "2009-06-24 03:00:00",
"2009-06-24 06:00:00", "2009-06-24 07:00:00", "2009-06-24 08:00:00",
"2009-06-24 09:00:00", "2009-06-24 10:00:00"), class = "data.frame")