Calculate daily means from 5-minute interval data

x <- rnorm(samples.per.day * 365)
length(x)
[1] 105120

Reshape the fake data into a matrix where each row represents one
24-hour period.

m <- matrix(x, ncol=samples.per.day, byrow=TRUE)
Richard,

Now I understand the need to keep the date and time as a single datetime
column; separately dplyr's sumamrize() provides daily means (too many data
points to plot over 3-5 years). I reformatted the data to provide a
sampledatetime column and a values column.

If I correctly understand the output of as.POSIXlt each date and time
element is separate, so input such as 2016-03-03 12:00 would now be 2016 03
03 12 00 (I've not read how the elements are separated). (The TZ is not
important because all data are either PST or PDT.)
Now we can summarise the rows any way we want.
The basic tool here is ?apply.
?rowMeans is said to be faster than using apply to calculate means,
so we'll use that.  There is no *rowSds so we have to use apply
for the standard deviation.  I use ?head because I don't want to
post tens of thousands of meaningless numbers.
If I create a matrix using the above syntax the resulting rows contain all
recorded values for a specific day. What would be the syntax to collect all
values for each month?

This would result in 12 rows per year; the periods of record for the five
variables availble from that gauge station vary in length.

Regards,

Rich

If I correctly understand the output of as.POSIXlt each date and time
element is separate, so input such as 2016-03-03 12:00 would now be 2016 03
03 12 00 (I've not read how the elements are separated). (The TZ is not
important because all data are either PST or PDT.)
Using this script:
discharge <- read.csv('../data/water/discharge.dat', header = TRUE, sep = ',', stringsAsFactors = FALSE)
discharge$sampdate <- as.POSIXlt(discharge$sampdate, tz = "",
                                  format = '%Y-%m-%d %H:%M',
                                  optional = 'logical')
discharge$cfs <- as.numeric(discharge$cfs, length = 6)

I get this result:
head(discharge)
sampdate    cfs
1 2016-03-03 12:00:00 149000
2 2016-03-03 12:10:00 150000
3 2016-03-03 12:20:00 151000
4 2016-03-03 12:30:00 156000
5 2016-03-03 12:40:00 154000
6 2016-03-03 12:50:00 150000

I'm completely open to suggestions on using this output to calculate monthly
means and sds.

If dplyr:summarize() will do so please show me how to modify this command:
disc_monthly <- ( discharge
         %>% group_by(sampdate)
         %>% summarize(exp_value = mean(cfs, na.rm = TRUE))
because it produces daily means, not monthly means.

TIA,

Rich
You could use 'split' to create a list of data frames, and then apply a
function to each to get the means and sds.

cols <- "cfs"  # add more as necessary
S <- split(discharge[cols], format(discharge$sampdate, format = "%Y-%m"))
means <- do.call("rbind", lapply(S, colMeans, na.rm = TRUE))
sds   <- do.call("rbind", lapply(S, function(xx) sapply(xx, sd, na.rm =
TRUE)))

On Thu, Sep 2, 2021 at 3:01 PM Rich Shepard <rshepard at appl-ecosys.com>
wrote:
On Thu, 2 Sep 2021, Rich Shepard wrote:

If I correctly understand the output of as.POSIXlt each date and time
element is separate, so input such as 2016-03-03 12:00 would now be 2016
03
03 12 00 (I've not read how the elements are separated). (The TZ is not
important because all data are either PST or PDT.)
Using this script:
discharge <- read.csv('../data/water/discharge.dat', header = TRUE, sep =
',', stringsAsFactors = FALSE)
discharge$sampdate <- as.POSIXlt(discharge$sampdate, tz = "",
                                  format = '%Y-%m-%d %H:%M',
                                  optional = 'logical')
discharge$cfs <- as.numeric(discharge$cfs, length = 6)

I get this result:
head(discharge)
              sampdate    cfs
1 2016-03-03 12:00:00 149000
2 2016-03-03 12:10:00 150000
3 2016-03-03 12:20:00 151000
4 2016-03-03 12:30:00 156000
5 2016-03-03 12:40:00 154000
6 2016-03-03 12:50:00 150000

I'm completely open to suggestions on using this output to calculate
monthly
means and sds.

If dplyr:summarize() will do so please show me how to modify this command:
disc_monthly <- ( discharge
         %>% group_by(sampdate)
         %>% summarize(exp_value = mean(cfs, na.rm = TRUE))
because it produces daily means, not monthly means.

TIA,

Rich

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

You could use 'split' to create a list of data frames, and then apply a
function to each to get the means and sds.

cols <- "cfs"  # add more as necessary
S <- split(discharge[cols], format(discharge$sampdate, format = "%Y-%m"))
means <- do.call("rbind", lapply(S, colMeans, na.rm = TRUE))
sds   <- do.call("rbind", lapply(S, function(xx) sapply(xx, sd, na.rm =
TRUE)))
Andrew,

Thank you for the valuable lesson. This is new to me and I know I'll have
use for it in the future, too.

Much appreciated!

Stay well,

Rich
Regardless of whether you use the lower-level split function, or the higher-level aggregate function, or the tidyverse group_by function, the key is learning how to create the column that is the same for all records corresponding to the time interval of interest.

If you convert the sampdate to POSIXct, the tz IS important, because most of us use local timezones that respect daylight savings time, and a naive conversion of standard time will run into trouble if R is assuming daylight savings time applies. The lubridate package gets around this by always assuming UTC and giving you a function to "fix" the timezone after the conversion. I prefer to always be specific about timezones, at least by using so something like

    Sys.setenv( TZ = "Etc/GMT+8" )

which does not respect daylight savings.

Regarding using character data for identifying the month, in order to have clean plots of the data I prefer to use the trunc function but it returns a POSIXlt so I convert it to POSIXct:

    discharge$sampmonthbegin <- as.POSIXct( trunc( discharge$sampdate, units = "months" ) )

Then any of various ways can be used to aggregate the records by that column.
You could use 'split' to create a list of data frames, and then apply a
function to each to get the means and sds.

cols <- "cfs"  # add more as necessary
S <- split(discharge[cols], format(discharge$sampdate, format = "%Y-%m"))
means <- do.call("rbind", lapply(S, colMeans, na.rm = TRUE))
sds   <- do.call("rbind", lapply(S, function(xx) sapply(xx, sd, na.rm =
TRUE)))

On Thu, Sep 2, 2021 at 3:01 PM Rich Shepard <rshepard at appl-ecosys.com>
wrote:

On Thu, 2 Sep 2021, Rich Shepard wrote:

If I correctly understand the output of as.POSIXlt each date and time
element is separate, so input such as 2016-03-03 12:00 would now be 2016
03
03 12 00 (I've not read how the elements are separated). (The TZ is not
important because all data are either PST or PDT.)
Using this script:
discharge <- read.csv('../data/water/discharge.dat', header = TRUE, sep =
',', stringsAsFactors = FALSE)
discharge$sampdate <- as.POSIXlt(discharge$sampdate, tz = "",
                                  format = '%Y-%m-%d %H:%M',
                                  optional = 'logical')
discharge$cfs <- as.numeric(discharge$cfs, length = 6)

I get this result:
head(discharge)
              sampdate    cfs
1 2016-03-03 12:00:00 149000
2 2016-03-03 12:10:00 150000
3 2016-03-03 12:20:00 151000
4 2016-03-03 12:30:00 156000
5 2016-03-03 12:40:00 154000
6 2016-03-03 12:50:00 150000

I'm completely open to suggestions on using this output to calculate
monthly
means and sds.

If dplyr:summarize() will do so please show me how to modify this command:
disc_monthly <- ( discharge
         %>% group_by(sampdate)
         %>% summarize(exp_value = mean(cfs, na.rm = TRUE))
because it produces daily means, not monthly means.

TIA,

Rich

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Sent from my phone. Please excuse my brevity.

Regardless of whether you use the lower-level split function, or the
higher-level aggregate function, or the tidyverse group_by function, the
key is learning how to create the column that is the same for all records
corresponding to the time interval of interest.
Jeff,

I tried responding to only you but my message bounced:

<jdnewmil at dcn.davis.ca.us>: host
     d9300a.ess.barracudanetworks.com[209.222.82.252] said: 550 permanent
     failure for one or more recipients (jdnewmil at dcn.davis.ca.us:blocked) (in
     reply to end of DATA command)

My response was not pertininet to the entire list, IMO, so I sent it to your
address.

Rich
What is the best way to read (from a text file) timestamps from the fall
time change, where there are two 1:15am's?  E.g., here is an extract from a
US Geological Survey web site giving data on the river through our county
on 2020-11-01, when we changed from PDT to PST,
https://nwis.waterdata.usgs.gov/wa/nwis/uv/?cb_00010=on&cb_00060=on&cb_00065=on&format=rdb&site_no=12200500&period=&begin_date=2020-11-01&end_date=2020-11-05
.

The timestamps include the date and time as well as PDT or PST.

river <-
c("datetime,tz,discharge,height,temp",
  "2020-11-01 00:00,PDT,20500,16.44,9.3",
  "2020-11-01 00:15,PDT,20500,16.44,9.3",
  "2020-11-01 00:30,PDT,20500,16.43,9.3",
  "2020-11-01 00:45,PDT,20400,16.40,9.3",
  "2020-11-01 01:00,PDT,20400,16.40,9.3",
  "2020-11-01 01:00,PST,20200,16.34,9.2",
  "2020-11-01 01:15,PDT,20400,16.39,9.3",
  "2020-11-01 01:15,PST,20200,16.34,9.2",
  "2020-11-01 01:30,PDT,20300,16.37,9.2",
  "2020-11-01 01:30,PST,20100,16.31,9.2",
  "2020-11-01 01:45,PDT,20300,16.35,9.2",
  "2020-11-01 01:45,PST,20100,16.29,9.2",
  "2020-11-01 02:00,PST,20100,16.29,9.2",
  "2020-11-01 02:15,PST,20000,16.27,9.1",
  "2020-11-01 02:30,PST,20000,16.26,9.1"
  )
d <- read.table(text=river, sep=",",header=TRUE)

The entries are obviously not in time order.

Is there a simple way to read the timedate and tz columns together?  One
way is to use d$tz to construct an offset that can be read with
strptime's "%z".
d$POSIXct <-
as.POSIXct(paste(d$datetime,ifelse(d$tz=="PDT","-0700","-0800")),
format="%Y-%m-%d %H:%M %z")
d
datetime  tz discharge height temp             POSIXct
1  2020-11-01 00:00 PDT     20500  16.44  9.3 2020-11-01 00:00:00
2  2020-11-01 00:15 PDT     20500  16.44  9.3 2020-11-01 00:15:00
3  2020-11-01 00:30 PDT     20500  16.43  9.3 2020-11-01 00:30:00
4  2020-11-01 00:45 PDT     20400  16.40  9.3 2020-11-01 00:45:00
5  2020-11-01 01:00 PDT     20400  16.40  9.3 2020-11-01 01:00:00
6  2020-11-01 01:00 PST     20200  16.34  9.2 2020-11-01 01:00:00
7  2020-11-01 01:15 PDT     20400  16.39  9.3 2020-11-01 01:15:00
8  2020-11-01 01:15 PST     20200  16.34  9.2 2020-11-01 01:15:00
9  2020-11-01 01:30 PDT     20300  16.37  9.2 2020-11-01 01:30:00
10 2020-11-01 01:30 PST     20100  16.31  9.2 2020-11-01 01:30:00
11 2020-11-01 01:45 PDT     20300  16.35  9.2 2020-11-01 01:45:00
12 2020-11-01 01:45 PST     20100  16.29  9.2 2020-11-01 01:45:00
13 2020-11-01 02:00 PST     20100  16.29  9.2 2020-11-01 02:00:00
14 2020-11-01 02:15 PST     20000  16.27  9.1 2020-11-01 02:15:00
15 2020-11-01 02:30 PST     20000  16.26  9.1 2020-11-01 02:30:00
with(d[order(d$POSIXct),], plot(temp)) # monotonic temperature
-Bill

On Thu, Sep 2, 2021 at 12:41 PM Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
wrote:
Regardless of whether you use the lower-level split function, or the
higher-level aggregate function, or the tidyverse group_by function, the
key is learning how to create the column that is the same for all records
corresponding to the time interval of interest.

If you convert the sampdate to POSIXct, the tz IS important, because most
of us use local timezones that respect daylight savings time, and a naive
conversion of standard time will run into trouble if R is assuming daylight
savings time applies. The lubridate package gets around this by always
assuming UTC and giving you a function to "fix" the timezone after the
conversion. I prefer to always be specific about timezones, at least by
using so something like

    Sys.setenv( TZ = "Etc/GMT+8" )

which does not respect daylight savings.

Regarding using character data for identifying the month, in order to have
clean plots of the data I prefer to use the trunc function but it returns a
POSIXlt so I convert it to POSIXct:

    discharge$sampmonthbegin <- as.POSIXct( trunc( discharge$sampdate,
units = "months" ) )

Then any of various ways can be used to aggregate the records by that
column.

On September 2, 2021 12:10:15 PM PDT, Andrew Simmons <akwsimmo at gmail.com>
wrote:
You could use 'split' to create a list of data frames, and then apply a
function to each to get the means and sds.

cols <- "cfs"  # add more as necessary
S <- split(discharge[cols], format(discharge$sampdate, format = "%Y-%m"))
means <- do.call("rbind", lapply(S, colMeans, na.rm = TRUE))
sds   <- do.call("rbind", lapply(S, function(xx) sapply(xx, sd, na.rm =
TRUE)))

On Thu, Sep 2, 2021 at 3:01 PM Rich Shepard <rshepard at appl-ecosys.com>
wrote:

On Thu, 2 Sep 2021, Rich Shepard wrote:

If I correctly understand the output of as.POSIXlt each date and time
element is separate, so input such as 2016-03-03 12:00 would now be
2016
03
03 12 00 (I've not read how the elements are separated). (The TZ is
not
important because all data are either PST or PDT.)
Using this script:
discharge <- read.csv('../data/water/discharge.dat', header = TRUE, sep
=
',', stringsAsFactors = FALSE)
discharge$sampdate <- as.POSIXlt(discharge$sampdate, tz = "",
                                  format = '%Y-%m-%d %H:%M',
                                  optional = 'logical')
discharge$cfs <- as.numeric(discharge$cfs, length = 6)

I get this result:
head(discharge)
              sampdate    cfs
1 2016-03-03 12:00:00 149000
2 2016-03-03 12:10:00 150000
3 2016-03-03 12:20:00 151000
4 2016-03-03 12:30:00 156000
5 2016-03-03 12:40:00 154000
6 2016-03-03 12:50:00 150000

I'm completely open to suggestions on using this output to calculate
monthly
means and sds.

If dplyr:summarize() will do so please show me how to modify this
command:
disc_monthly <- ( discharge
         %>% group_by(sampdate)
         %>% summarize(exp_value = mean(cfs, na.rm = TRUE))
because it produces daily means, not monthly means.

TIA,

Rich

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

      [[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

--
Sent from my phone. Please excuse my brevity.

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

This problem nearly always boils down to using meta knowledge about the file. Having informal TZ info in the file is very helpful, but PST is not necessarily a uniquely-defined time zone specification so you have to draw on information outside of the file to know that these codes correspond to -0800 etc. (e.g. CST could be China Standard Time or US Central Standard Time.) Thus, it is tough to make this into a broadly-useful function.

You can also construct the timezone column from knowledge about the location of interest and the monotonicity of the time data. https://jdnewmil.github.io/eci298sp2016/QuickHowtos1.html#handling-time-data ... but the answer to "easy" seems firmly in the eyes of the beholder.
What is the best way to read (from a text file) timestamps from the fall
time change, where there are two 1:15am's?  E.g., here is an extract from a
US Geological Survey web site giving data on the river through our county
on 2020-11-01, when we changed from PDT to PST,
https://nwis.waterdata.usgs.gov/wa/nwis/uv/?cb_00010=on&cb_00060=on&cb_00065=on&format=rdb&site_no=12200500&period=&begin_date=2020-11-01&end_date=2020-11-05
.

The timestamps include the date and time as well as PDT or PST.

river <-
c("datetime,tz,discharge,height,temp",
 "2020-11-01 00:00,PDT,20500,16.44,9.3",
 "2020-11-01 00:15,PDT,20500,16.44,9.3",
 "2020-11-01 00:30,PDT,20500,16.43,9.3",
 "2020-11-01 00:45,PDT,20400,16.40,9.3",
 "2020-11-01 01:00,PDT,20400,16.40,9.3",
 "2020-11-01 01:00,PST,20200,16.34,9.2",
 "2020-11-01 01:15,PDT,20400,16.39,9.3",
 "2020-11-01 01:15,PST,20200,16.34,9.2",
 "2020-11-01 01:30,PDT,20300,16.37,9.2",
 "2020-11-01 01:30,PST,20100,16.31,9.2",
 "2020-11-01 01:45,PDT,20300,16.35,9.2",
 "2020-11-01 01:45,PST,20100,16.29,9.2",
 "2020-11-01 02:00,PST,20100,16.29,9.2",
 "2020-11-01 02:15,PST,20000,16.27,9.1",
 "2020-11-01 02:30,PST,20000,16.26,9.1"
 )
d <- read.table(text=river, sep=",",header=TRUE)

The entries are obviously not in time order.

Is there a simple way to read the timedate and tz columns together?  One
way is to use d$tz to construct an offset that can be read with
strptime's "%z".

d$POSIXct <-
as.POSIXct(paste(d$datetime,ifelse(d$tz=="PDT","-0700","-0800")),
format="%Y-%m-%d %H:%M %z")
d
          datetime  tz discharge height temp             POSIXct
1  2020-11-01 00:00 PDT     20500  16.44  9.3 2020-11-01 00:00:00
2  2020-11-01 00:15 PDT     20500  16.44  9.3 2020-11-01 00:15:00
3  2020-11-01 00:30 PDT     20500  16.43  9.3 2020-11-01 00:30:00
4  2020-11-01 00:45 PDT     20400  16.40  9.3 2020-11-01 00:45:00
5  2020-11-01 01:00 PDT     20400  16.40  9.3 2020-11-01 01:00:00
6  2020-11-01 01:00 PST     20200  16.34  9.2 2020-11-01 01:00:00
7  2020-11-01 01:15 PDT     20400  16.39  9.3 2020-11-01 01:15:00
8  2020-11-01 01:15 PST     20200  16.34  9.2 2020-11-01 01:15:00
9  2020-11-01 01:30 PDT     20300  16.37  9.2 2020-11-01 01:30:00
10 2020-11-01 01:30 PST     20100  16.31  9.2 2020-11-01 01:30:00
11 2020-11-01 01:45 PDT     20300  16.35  9.2 2020-11-01 01:45:00
12 2020-11-01 01:45 PST     20100  16.29  9.2 2020-11-01 01:45:00
13 2020-11-01 02:00 PST     20100  16.29  9.2 2020-11-01 02:00:00
14 2020-11-01 02:15 PST     20000  16.27  9.1 2020-11-01 02:15:00
15 2020-11-01 02:30 PST     20000  16.26  9.1 2020-11-01 02:30:00
with(d[order(d$POSIXct),], plot(temp)) # monotonic temperature
-Bill

On Thu, Sep 2, 2021 at 12:41 PM Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
wrote:

Regardless of whether you use the lower-level split function, or the
higher-level aggregate function, or the tidyverse group_by function, the
key is learning how to create the column that is the same for all records
corresponding to the time interval of interest.

If you convert the sampdate to POSIXct, the tz IS important, because most
of us use local timezones that respect daylight savings time, and a naive
conversion of standard time will run into trouble if R is assuming daylight
savings time applies. The lubridate package gets around this by always
assuming UTC and giving you a function to "fix" the timezone after the
conversion. I prefer to always be specific about timezones, at least by
using so something like

    Sys.setenv( TZ = "Etc/GMT+8" )

which does not respect daylight savings.

Regarding using character data for identifying the month, in order to have
clean plots of the data I prefer to use the trunc function but it returns a
POSIXlt so I convert it to POSIXct:

    discharge$sampmonthbegin <- as.POSIXct( trunc( discharge$sampdate,
units = "months" ) )

Then any of various ways can be used to aggregate the records by that
column.

On September 2, 2021 12:10:15 PM PDT, Andrew Simmons <akwsimmo at gmail.com>
wrote:
You could use 'split' to create a list of data frames, and then apply a
function to each to get the means and sds.

cols <- "cfs"  # add more as necessary
S <- split(discharge[cols], format(discharge$sampdate, format = "%Y-%m"))
means <- do.call("rbind", lapply(S, colMeans, na.rm = TRUE))
sds   <- do.call("rbind", lapply(S, function(xx) sapply(xx, sd, na.rm =
TRUE)))

On Thu, Sep 2, 2021 at 3:01 PM Rich Shepard <rshepard at appl-ecosys.com>
wrote:

On Thu, 2 Sep 2021, Rich Shepard wrote:

If I correctly understand the output of as.POSIXlt each date and time
element is separate, so input such as 2016-03-03 12:00 would now be
2016
03
03 12 00 (I've not read how the elements are separated). (The TZ is
not
important because all data are either PST or PDT.)
Using this script:
discharge <- read.csv('../data/water/discharge.dat', header = TRUE, sep
=
',', stringsAsFactors = FALSE)
discharge$sampdate <- as.POSIXlt(discharge$sampdate, tz = "",
                                  format = '%Y-%m-%d %H:%M',
                                  optional = 'logical')
discharge$cfs <- as.numeric(discharge$cfs, length = 6)

I get this result:
head(discharge)
              sampdate    cfs
1 2016-03-03 12:00:00 149000
2 2016-03-03 12:10:00 150000
3 2016-03-03 12:20:00 151000
4 2016-03-03 12:30:00 156000
5 2016-03-03 12:40:00 154000
6 2016-03-03 12:50:00 150000

I'm completely open to suggestions on using this output to calculate
monthly
means and sds.

If dplyr:summarize() will do so please show me how to modify this
command:
disc_monthly <- ( discharge
         %>% group_by(sampdate)
         %>% summarize(exp_value = mean(cfs, na.rm = TRUE))
because it produces daily means, not monthly means.

TIA,

Rich

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

      [[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

--
Sent from my phone. Please excuse my brevity.

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Sent from my phone. Please excuse my brevity.