Calculate daily means from 5-minute interval data

Regardless of whether you use the lower-level split function, or the higher-level aggregate function, or the tidyverse group_by function, the key is learning how to create the column that is the same for all records corresponding to the time interval of interest.

If you convert the sampdate to POSIXct, the tz IS important, because most of us use local timezones that respect daylight savings time, and a naive conversion of standard time will run into trouble if R is assuming daylight savings time applies. The lubridate package gets around this by always assuming UTC and giving you a function to "fix" the timezone after the conversion. I prefer to always be specific about timezones, at least by using so something like

    Sys.setenv( TZ = "Etc/GMT+8" )

which does not respect daylight savings.

Regarding using character data for identifying the month, in order to have clean plots of the data I prefer to use the trunc function but it returns a POSIXlt so I convert it to POSIXct:

    discharge$sampmonthbegin <- as.POSIXct( trunc( discharge$sampdate, units = "months" ) )

Then any of various ways can be used to aggregate the records by that column.

Calculate daily means from 5-minute interval data

Thread (8 messages)