
Calculate daily means from 5-minute interval data

On Sun, 29 Aug 2021, Jeff Newmiller wrote:

            
Jeff,

I've read a number of docs discussing dplyr's summarize and group_by
functions (including that section of Hadley's 'R for Data Science' book), yet
I'm missing something. I think that I need to separate the single sampdate
column into columns for year, month, and day, then group_by year/month and
summarize within those groups.

The data are of this format:
sampdate,samptime,cfs
2020-08-26,09:30,136000
2020-08-26,09:35,126000
2020-08-26,09:40,130000
2020-08-26,09:45,128000
2020-08-26,09:50,126000
2020-08-26,09:55,125000
2020-08-26,10:00,121000
2020-08-26,10:05,117000
2020-08-26,10:10,120000
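
The grouping described above can be sketched as follows. This is a minimal sketch, not the poster's script: it assumes dplyr 1.0.0 or later (for the `.groups` argument), reads only the nine sample rows shown here from an inline string, and the output column names `mean_cfs` and `sd_cfs` are illustrative. Since sampdate already holds one value per day, daily means need no splitting; year and month columns are only needed for monthly summaries.

```r
library(dplyr)

# The nine sample rows from this message, read from an inline string
# (an assumption for illustration; the real data live in discharge.dat).
discharge <- read.csv(text = "sampdate,samptime,cfs
2020-08-26,09:30,136000
2020-08-26,09:35,126000
2020-08-26,09:40,130000
2020-08-26,09:45,128000
2020-08-26,09:50,126000
2020-08-26,09:55,125000
2020-08-26,10:00,121000
2020-08-26,10:05,117000
2020-08-26,10:10,120000", stringsAsFactors = FALSE)
discharge$sampdate <- as.Date(discharge$sampdate)

# Daily means: group directly on the date column.
daily <- discharge %>%
  group_by(sampdate) %>%
  summarize(mean_cfs = mean(cfs, na.rm = TRUE),
            sd_cfs   = sd(cfs, na.rm = TRUE),
            .groups  = "drop")

# Monthly means: derive year and month first.
# Note that %m is the month; %M would be minutes.
monthly <- discharge %>%
  mutate(year  = as.integer(format(sampdate, "%Y")),
         month = as.integer(format(sampdate, "%m"))) %>%
  group_by(year, month) %>%
  summarize(mean_cfs = mean(cfs, na.rm = TRUE), .groups = "drop")
```

With only one day of sample data, both `daily` and `monthly` contain a single row; on the full file each would have one row per day or per year/month pair.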

My current script is:

-------8<--------------
library('tidyverse')

discharge <- read.table('../data/discharge.dat', header = TRUE, sep = ',', stringsAsFactors = TRUE)
discharge$sampdate <- as.Date(discharge$sampdate)
discharge$cfs <- as.numeric(discharge$cfs, length = 6)

# use dplyr.summarize grouped by date

# need to separate sampdate into %Y-%m-%d in order to group_by the month?
by_month <- discharge %>%
   group_by(sampdate ...
summarize(by_month, exp_value = mean(cfs, na.rm = TRUE), sd(cfs))
---------------->8--------

and the results are:
'data.frame':	93254 obs. of  3 variables:
  $ sampdate: Date, format: "2020-08-26" "2020-08-26" ...
  $ samptime: Factor w/ 728 levels "00:00","00:05",..: 115 116 117 118 123 128 133 138 143 148 ...
  $ cfs     : num  176 156 165 161 156 154 144 137 142 142 ...
[1] "by_month"  "discharge"
# A tibble: 93,254 × 3
# Groups:   sampdate [322]
    sampdate   samptime   cfs
    <date>     <fct>    <dbl>
  1 2020-08-26 09:30      176
  2 2020-08-26 09:35      156
  3 2020-08-26 09:40      165
  4 2020-08-26 09:45      161
  5 2020-08-26 09:50      156
  6 2020-08-26 09:55      154
  7 2020-08-26 10:00      144
  8 2020-08-26 10:05      137
  9 2020-08-26 10:10      142
 10 2020-08-26 10:15      142
# … with 93,244 more rows

I don't know why the discharge values are truncated to 3 digits when they're
6 digits in the input data.
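
One likely explanation, offered as an assumption since the full data file is not shown: with stringsAsFactors = TRUE, a column containing any non-numeric entry is read as a factor, and as.numeric() on a factor returns the integer level codes rather than the original values. The three-digit numbers above rise and fall with the six-digit inputs, which fits level codes. The `length = 6` argument also appears to have no effect on the conversion. The sketch below uses a made-up vector with a hypothetical "Ice" entry to show the behavior and the usual workaround.

```r
# Hypothetical column: the "Ice" entry is invented for illustration and
# stands in for whatever non-numeric text might be in the real file.
raw <- c("136000", "126000", "130000", "Ice", "117000")
cfs_factor <- factor(raw)

# Converting the factor directly yields the level codes (small integers),
# not the discharge values.
codes <- as.numeric(cfs_factor)

# Converting via character recovers the values; the non-numeric entry
# becomes NA (R warns about this, suppressed here for a clean run).
cfs <- suppressWarnings(as.numeric(as.character(cfs_factor)))
```

Reading the file with stringsAsFactors = FALSE and then checking which rows of cfs fail to convert is one way to find the offending entries.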

Suggested readings appreciated,

Rich
