There are quite a variety of approaches implemented in various contributed packages, but here is a base R approach based on merge: Sys.setenv(TZ = "UTC" ) # or other non-DST zone unless you need it YY$TIMESTAMP <- as.POSIXct( YY$TIMESTAMP, format = "%Y/%m/%d %I:%M:%S %p" ) tlims <- as.POSIXct( c( "2021-05-02", "2021-05-10" ) ) tdiff <- as.difftime( 5, units="mins" ) aa <- seq( tlims[1], tlims[2], by = tdiff ) AA <- expand.grid( CHANNEL = 30, TIMESTAMP = aa ) yy <- merge( AA, YY[ , c( "CHANNEL", "TIMESTAMP", "RAINFALL" ) ], by = c( "CHANNEL", "TIMESTAMP" ), all.x = TRUE ) yy$RAINFALL[ is.na( yy$RAINFALL ) ] <- 0 yy
On February 28, 2022 7:52:47 PM PST, Eliza Botto <eliza_botto at outlook.com> wrote:
[The data setting in the last email might be faulty] Dear useRs, I have the following dataset which represents rainfall data at a 5-minute interval from 1 May 2021 to 30 September 2021.
dput(YY)
structure(list(CHANNEL = c(30L, 30L, 30L, 30L, 30L, 30L, 30L,
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L,
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L,
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L,
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L,
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L,
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L,
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L,
30L, 30L), YEAR = c(2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L), TIMESTAMP = c("2021/05/02 10:00:00 PM",
"2021/05/02 10:55:00 PM", "2021/05/04 05:40:00 PM", "2021/05/04 06:50:00 PM",
"2021/05/05 03:05:00 AM", "2021/05/08 05:15:00 AM", "2021/05/08 05:20:00 AM",
"2021/05/08 05:30:00 AM", "2021/05/08 05:50:00 AM", "2021/05/08 06:05:00 AM",
"2021/05/08 07:15:00 AM", "2021/05/08 08:00:00 AM", "2021/05/08 08:05:00 AM",
"2021/05/08 08:15:00 AM", "2021/05/08 08:35:00 AM", "2021/05/08 08:50:00 AM",
"2021/05/08 09:05:00 AM", "2021/05/08 09:30:00 AM", "2021/05/08 09:45:00 AM",
"2021/05/08 09:55:00 AM", "2021/05/08 10:10:00 AM", "2021/05/08 10:20:00 AM",
"2021/05/08 10:40:00 AM", "2021/05/08 10:55:00 AM", "2021/05/08 11:15:00 AM",
"2021/05/08 11:25:00 AM", "2021/05/08 11:35:00 AM", "2021/05/08 11:45:00 AM",
"2021/05/08 11:50:00 AM", "2021/05/08 12:00:00 PM", "2021/05/08 12:05:00 PM",
"2021/05/08 12:15:00 PM", "2021/05/08 12:20:00 PM", "2021/05/08 12:30:00 PM",
"2021/05/08 12:35:00 PM", "2021/05/08 12:50:00 PM", "2021/05/08 01:35:00 PM",
"2021/05/08 01:50:00 PM", "2021/05/08 02:20:00 PM", "2021/05/08 02:30:00 PM",
"2021/05/08 02:35:00 PM", "2021/05/08 03:00:00 PM", "2021/05/08 03:35:00 PM",
"2021/05/08 03:45:00 PM", "2021/05/08 04:30:00 PM", "2021/05/08 04:40:00 PM",
"2021/05/08 04:55:00 PM", "2021/05/08 05:05:00 PM", "2021/05/08 05:20:00 PM",
"2021/05/08 07:25:00 PM", "2021/05/08 09:00:00 PM", "2021/05/08 09:25:00 PM",
"2021/05/08 09:50:00 PM", "2021/05/08 10:15:00 PM", "2021/05/08 10:40:00 PM",
"2021/05/08 11:35:00 PM", "2021/05/09 12:40:00 AM", "2021/05/09 01:10:00 AM",
"2021/05/09 02:10:00 AM", "2021/05/09 06:00:00 AM", "2021/05/09 02:40:00 PM",
"2021/05/09 02:45:00 PM", "2021/05/09 02:50:00 PM", "2021/05/09 02:55:00 PM",
"2021/05/09 03:00:00 PM", "2021/05/09 03:05:00 PM", "2021/05/09 03:10:00 PM",
"2021/05/09 03:15:00 PM", "2021/05/09 03:20:00 PM", "2021/05/09 03:25:00 PM",
"2021/05/09 03:30:00 PM", "2021/05/09 03:35:00 PM", "2021/05/09 03:40:00 PM",
"2021/05/09 03:45:00 PM", "2021/05/09 03:50:00 PM", "2021/05/09 03:55:00 PM",
"2021/05/09 04:00:00 PM", "2021/05/09 04:05:00 PM", "2021/05/09 04:10:00 PM",
"2021/05/09 04:15:00 PM", "2021/05/09 04:25:00 PM", "2021/05/09 04:30:00 PM",
"2021/05/09 04:35:00 PM", "2021/05/09 04:40:00 PM", "2021/05/09 04:45:00 PM",
"2021/05/09 04:50:00 PM", "2021/05/09 05:00:00 PM", "2021/05/09 05:05:00 PM",
"2021/05/09 05:10:00 PM", "2021/05/09 05:20:00 PM", "2021/05/09 05:25:00 PM",
"2021/05/09 05:35:00 PM", "2021/05/09 05:45:00 PM", "2021/05/09 05:50:00 PM",
"2021/05/09 06:00:00 PM", "2021/05/09 06:10:00 PM", "2021/05/09 06:20:00 PM",
"2021/05/09 06:30:00 PM", "2021/05/09 06:40:00 PM", "2021/05/09 06:50:00 PM"
), RAINFALL = c(0.2, 0.2, 0.4, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,
0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,
0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,
0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,
0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,
0.2, 0.2, 0.2, 0.2, 0.2, 0.4, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,
0.2, 0.2, 0.2, 0.2, 0.2, 0.4, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,
0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2
)), row.names = c(276L, 286L, 599L, 773L, 829L, 951L, 955L, 971L,
996L, 1014L, 1123L, 1242L, 1260L, 1301L, 1378L, 1422L, 1456L,
1487L, 1504L, 1515L, 1539L, 1557L, 1597L, 1629L, 1679L, 1708L,
1728L, 1757L, 1775L, 1803L, 1818L, 1846L, 1859L, 1882L, 1892L,
1917L, 1983L, 2007L, 2050L, 2066L, 2077L, 2124L, 2190L, 2207L,
2288L, 2309L, 2334L, 2351L, 2374L, 2518L, 2588L, 2600L, 2616L,
2627L, 2639L, 2655L, 2674L, 2684L, 2725L, 2967L, 3826L, 3830L,
3832L, 3838L, 3842L, 3845L, 3846L, 3851L, 3854L, 3856L, 3861L,
3865L, 3868L, 3871L, 3873L, 3877L, 3880L, 3881L, 3885L, 3888L,
3890L, 3893L, 3897L, 3899L, 3900L, 3902L, 3906L, 3907L, 3910L,
3914L, 3915L, 3917L, 3920L, 3922L, 3923L, 3926L, 3928L, 3931L,
3932L, 3933L), class = "data.frame")
You could clearly see that there are some intervals which are missing from this dataset. For example, the data values for 1st of May are missing. Similarly,
between
30 2021 2021/05/02 10:00:00 PM 0.2
and
30 2021 2021/05/02 10:55:00 PM 0.2
the values of rainfall depth for following "time stamps" are missing because they were "zero"
30 2021 2021/05/02 10:05:00 PM 0.0
30 2021 2021/05/02 10:10:00 PM 0.0
30 2021 2021/05/02 10:15:00 PM 0.0
30 2021 2021/05/02 10:20:00 PM 0.0
30 2021 2021/05/02 10:25:00 PM 0.0
30 2021 2021/05/02 10:30:00 PM 0.0
30 2021 2021/05/02 10:35:00 PM 0.0
30 2021 2021/05/02 10:40:00 PM 0.0
30 2021 2021/05/02 10:45:00 PM 0.0
30 2021 2021/05/02 10:50:00 PM 0.0
So, what I want is a uniform list starting from 2021/05/01 to 2021/09/30 at every 5-minute intervals with "zero" values for the missing intervals in the original data list. I hope my question is clear.
Thank You very much in advance,
Eliza
[https://ipmcdn.avast.com/images/icons/icon-envelope-tick-green-avg-v1.png]<http://www.avg.com/email-signature?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail> Virus-free. www.avg.com<http://www.avg.com/email-signature?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail>
[[alternative HTML version deleted]]
______________________________________________ R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Sent from my phone. Please excuse my brevity.