Skip to content

setting zeros for the missing interval in data

3 messages · Eliza Botto, Jim Lemon, Avi Gross

#
Dear useRs,

I have the following dataset which represents rainfall data at a 5-minute interval from 1 May 2021 to 30 September 2021.
structure(list(?..CHANNEL = c(30L, 30L, 30L, 30L, 30L, 30L, 30L,
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L,
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L,
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L,
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L,
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L,
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L,
30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L, 30L,
30L, 30L), YEAR = c(2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L, 2021L,
2021L, 2021L, 2021L, 2021L), TIMESTAMP = c("2021/05/02 10:00:00 PM",
"2021/05/02 10:55:00 PM", "2021/05/04 05:40:00 PM", "2021/05/04 06:50:00 PM",
"2021/05/05 03:05:00 AM", "2021/05/08 05:15:00 AM", "2021/05/08 05:20:00 AM",
"2021/05/08 05:30:00 AM", "2021/05/08 05:50:00 AM", "2021/05/08 06:05:00 AM",
"2021/05/08 07:15:00 AM", "2021/05/08 08:00:00 AM", "2021/05/08 08:05:00 AM",
"2021/05/08 08:15:00 AM", "2021/05/08 08:35:00 AM", "2021/05/08 08:50:00 AM",
"2021/05/08 09:05:00 AM", "2021/05/08 09:30:00 AM", "2021/05/08 09:45:00 AM",
"2021/05/08 09:55:00 AM", "2021/05/08 10:10:00 AM", "2021/05/08 10:20:00 AM",
"2021/05/08 10:40:00 AM", "2021/05/08 10:55:00 AM", "2021/05/08 11:15:00 AM",
"2021/05/08 11:25:00 AM", "2021/05/08 11:35:00 AM", "2021/05/08 11:45:00 AM",
"2021/05/08 11:50:00 AM", "2021/05/08 12:00:00 PM", "2021/05/08 12:05:00 PM",
"2021/05/08 12:15:00 PM", "2021/05/08 12:20:00 PM", "2021/05/08 12:30:00 PM",
"2021/05/08 12:35:00 PM", "2021/05/08 12:50:00 PM", "2021/05/08 01:35:00 PM",
"2021/05/08 01:50:00 PM", "2021/05/08 02:20:00 PM", "2021/05/08 02:30:00 PM",
"2021/05/08 02:35:00 PM", "2021/05/08 03:00:00 PM", "2021/05/08 03:35:00 PM",
"2021/05/08 03:45:00 PM", "2021/05/08 04:30:00 PM", "2021/05/08 04:40:00 PM",
"2021/05/08 04:55:00 PM", "2021/05/08 05:05:00 PM", "2021/05/08 05:20:00 PM",
"2021/05/08 07:25:00 PM", "2021/05/08 09:00:00 PM", "2021/05/08 09:25:00 PM",
"2021/05/08 09:50:00 PM", "2021/05/08 10:15:00 PM", "2021/05/08 10:40:00 PM",
"2021/05/08 11:35:00 PM", "2021/05/09 12:40:00 AM", "2021/05/09 01:10:00 AM",
"2021/05/09 02:10:00 AM", "2021/05/09 06:00:00 AM", "2021/05/09 02:40:00 PM",
"2021/05/09 02:45:00 PM", "2021/05/09 02:50:00 PM", "2021/05/09 02:55:00 PM",
"2021/05/09 03:00:00 PM", "2021/05/09 03:05:00 PM", "2021/05/09 03:10:00 PM",
"2021/05/09 03:15:00 PM", "2021/05/09 03:20:00 PM", "2021/05/09 03:25:00 PM",
"2021/05/09 03:30:00 PM", "2021/05/09 03:35:00 PM", "2021/05/09 03:40:00 PM",
"2021/05/09 03:45:00 PM", "2021/05/09 03:50:00 PM", "2021/05/09 03:55:00 PM",
"2021/05/09 04:00:00 PM", "2021/05/09 04:05:00 PM", "2021/05/09 04:10:00 PM",
"2021/05/09 04:15:00 PM", "2021/05/09 04:25:00 PM", "2021/05/09 04:30:00 PM",
"2021/05/09 04:35:00 PM", "2021/05/09 04:40:00 PM", "2021/05/09 04:45:00 PM",
"2021/05/09 04:50:00 PM", "2021/05/09 05:00:00 PM", "2021/05/09 05:05:00 PM",
"2021/05/09 05:10:00 PM", "2021/05/09 05:20:00 PM", "2021/05/09 05:25:00 PM",
"2021/05/09 05:35:00 PM", "2021/05/09 05:45:00 PM", "2021/05/09 05:50:00 PM",
"2021/05/09 06:00:00 PM", "2021/05/09 06:10:00 PM", "2021/05/09 06:20:00 PM",
"2021/05/09 06:30:00 PM", "2021/05/09 06:40:00 PM", "2021/05/09 06:50:00 PM"
), RAINFALL = c(0.2, 0.2, 0.4, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,
0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,
0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,
0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,
0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,
0.2, 0.2, 0.2, 0.2, 0.2, 0.4, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,
0.2, 0.2, 0.2, 0.2, 0.2, 0.4, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2,
0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2
)), row.names = c(276L, 286L, 599L, 773L, 829L, 951L, 955L, 971L,
996L, 1014L, 1123L, 1242L, 1260L, 1301L, 1378L, 1422L, 1456L,
1487L, 1504L, 1515L, 1539L, 1557L, 1597L, 1629L, 1679L, 1708L,
1728L, 1757L, 1775L, 1803L, 1818L, 1846L, 1859L, 1882L, 1892L,
1917L, 1983L, 2007L, 2050L, 2066L, 2077L, 2124L, 2190L, 2207L,
2288L, 2309L, 2334L, 2351L, 2374L, 2518L, 2588L, 2600L, 2616L,
2627L, 2639L, 2655L, 2674L, 2684L, 2725L, 2967L, 3826L, 3830L,
3832L, 3838L, 3842L, 3845L, 3846L, 3851L, 3854L, 3856L, 3861L,
3865L, 3868L, 3871L, 3873L, 3877L, 3880L, 3881L, 3885L, 3888L,
3890L, 3893L, 3897L, 3899L, 3900L, 3902L, 3906L, 3907L, 3910L,
3914L, 3915L, 3917L, 3920L, 3922L, 3923L, 3926L, 3928L, 3931L,
3932L, 3933L), class = "data.frame")

You could clearly see that there are some intervals which are missing from this dataset. For example, the data values for 1st of May are missing. Similarly,

between

30 2021 2021/05/02 10:00:00 PM      0.2

and

30 2021 2021/05/02 10:55:00 PM      0.2

the values of rainfall depth for following "time stamps" are missing because they were "zero"

30 2021 2021/05/02 10:05:00 PM      0.0

30 2021 2021/05/02 10:10:00 PM      0.0

30 2021 2021/05/02 10:15:00 PM      0.0

30 2021 2021/05/02 10:20:00 PM      0.0

30 2021 2021/05/02 10:25:00 PM      0.0

30 2021 2021/05/02 10:30:00 PM      0.0

30 2021 2021/05/02 10:35:00 PM      0.0

30 2021 2021/05/02 10:40:00 PM      0.0

30 2021 2021/05/02 10:45:00 PM      0.0

30 2021 2021/05/02 10:50:00 PM      0.0

So, what I want is a uniform list starting from 2021/05/01 to 2021/09/30 at every 5-minute intervals with "zero" values for the missing intervals in the original data list. I hope my question is clear.

Thank You very much in advance,

Eliza



[https://ipmcdn.avast.com/images/icons/icon-envelope-tick-green-avg-v1.png]<http://www.avg.com/email-signature?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail>     Virus-free. www.avg.com<http://www.avg.com/email-signature?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail>
#
Hi Eliza,
Your data wouldn't read for me, so I had to do some quick hacking to
get something to work with. It's a bit long, so I've attached what may
be a solution as an R source file.

Jim
On Tue, Mar 1, 2022 at 2:47 PM Eliza Botto <eliza_botto at outlook.com> wrote:
#
Hi Jim,

Just FYI, your reply to Eliza likely included your attachment but the copy to this list has it stripped.

I suspect the world would just work better if forums like this moved from text-only email to something like groups.io that allow richer text and attachments, albeit they do make images fuzzier to save on space. 

Regardless,

Avi


-----Original Message-----
From: Jim Lemon <drjimlemon at gmail.com>
To: Eliza Botto <eliza_botto at outlook.com>
Cc: R-help at r-project.org <R-help at r-project.org>
Sent: Mon, Feb 28, 2022 11:35 pm
Subject: Re: [R] setting zeros for the missing interval in data


Hi Eliza,
Your data wouldn't read for me, so I had to do some quick hacking to
get something to work with. It's a bit long, so I've attached what may
be a solution as an R source file.

Jim
On Tue, Mar 1, 2022 at 2:47 PM Eliza Botto <eliza_botto at outlook.com> wrote:
______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.