Calculate daily means from 5-minute interval data
On Mon, 30 Aug 2021, Richard O'Keefe wrote:
x <- rnorm(samples.per.day * 365)
length(x)
[1] 105120

Reshape the fake data into a matrix where each row represents one 24-hour period.
m <- matrix(x, ncol=samples.per.day, byrow=TRUE)
Richard,

Now I understand the need to keep the date and time as a single datetime column. Separately, dplyr's summarize() provides daily means (there are too many data points to plot over 3-5 years). I reformatted the data to provide a sampledatetime column and a values column. If I correctly understand the output of as.POSIXlt(), each date and time element is stored separately, so input such as 2016-03-03 12:00 would become 2016 03 03 12 00 (I've not read how the elements are separated). The TZ is not important because all data are either PST or PDT.
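As a small illustration of the as.POSIXlt() point above: a POSIXlt value is list-like, and its elements can be read by name rather than being separated by any character. The sketch below uses the timestamp from the message; the tz = "UTC" argument is an assumption made only so the result does not depend on the local time zone.

```r
# Timestamp taken from the example in the message.
# tz = "UTC" is an assumption, used only to keep the sketch reproducible.
dt <- as.POSIXlt("2016-03-03 12:00", tz = "UTC")

dt$year + 1900   # years are stored as an offset from 1900
dt$mon + 1       # months are stored as 0-11
dt$mday          # day of the month
dt$hour          # hour of the day
dt$min           # minute of the hour

# The calendar date alone, which can serve as a grouping key for daily means
as.Date(dt)
```

Note that year and mon need the adjustments shown before they match the printed date.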
Now we can summarise the rows any way we want. The basic tool here is ?apply. ?rowMeans is said to be faster than using apply to calculate means, so we'll use that. There is no *rowSds so we have to use apply for the standard deviation. I use ?head because I don't want to post tens of thousands of meaningless numbers.
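A minimal sketch of the row summaries described above, rebuilding the fake data so it runs on its own. The value 288 for samples.per.day is inferred from 105120 / 365 (one sample every 5 minutes), and the seed is a hypothetical choice made only for reproducibility.

```r
# Assumption: 288 five-minute samples per day (24 * 60 / 5).
samples.per.day <- 288
set.seed(1)  # hypothetical seed, only so the sketch is reproducible
x <- rnorm(samples.per.day * 365)
m <- matrix(x, ncol = samples.per.day, byrow = TRUE)

daily.mean <- rowMeans(m)       # one mean per row, i.e. per day
daily.sd   <- apply(m, 1, sd)   # MARGIN = 1 applies sd() to each row

# Show only the first few values of each summary
head(daily.mean)
head(daily.sd)
```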
If I create a matrix using the above syntax, each resulting row contains all recorded values for a specific day. What would be the syntax to collect all values for each month? This would result in 12 rows per year; the periods of record for the five variables available from that gauge station vary in length.

Regards,

Rich
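One possible approach to the monthly question, offered as a sketch rather than a definitive answer: because months differ in length, the values do not fit a rectangular matrix, so this version groups them into a list with split() instead. The data are fabricated for illustration; the names sampledatetime and values follow the columns mentioned above, and the start date, time zone, and seed are assumptions.

```r
# Hypothetical data: one leap year of 5-minute readings (illustration only).
sampledatetime <- seq(as.POSIXct("2016-01-01 00:00", tz = "UTC"),
                      by = "5 min", length.out = 288 * 366)
set.seed(1)  # hypothetical seed, only so the sketch is reproducible
values <- rnorm(length(sampledatetime))

# "YYYY-MM" key, so the same month in different years stays separate
month.key <- format(sampledatetime, "%Y-%m")

# A list with one numeric vector per month; the vectors differ in length
by.month <- split(values, month.key)

monthly.mean <- sapply(by.month, mean)
monthly.sd   <- sapply(by.month, sd)

length(by.month)   # number of months covered by the record
```

With a multi-year record the same key yields one entry per month of each year, regardless of how long each variable's period of record is.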