Message-ID: <alpine.LNX.2.20.2109131600180.10716@salmo.appl-ecosys.com>
Date: 2021-09-13T23:04:28Z
From: Rich Shepard
Subject: tidyverse: grouped summaries (with summarize) [RESOLVED]
In-Reply-To: <03e801d7a8ef$be336b90$3a9a42b0$@verizon.net>

On Mon, 13 Sep 2021, Avi Gross via R-help wrote:

> As Eric has pointed out, perhaps Rich is not thinking pipelined. Summarize() takes a first argument as:
> 	summarise(.data=whatever, ...)
>
> But in a pipeline, you OMIT the first argument and let the pipeline supply an argument silently.

Avi,

Thank you. I read your message carefully and re-read the example at the
bottom of page 60 and top of page 61. I then changed the command to:
disc_by_month = disc %>%
     group_by(year, month) %>%
     summarize(vol = mean(cfs, na.rm = TRUE))

And, the script now returns what I need:
> disc_by_month
# A tibble: 66 × 3
# Groups:   year [7]
     year month     vol
    <int> <int>   <dbl>
  1  2016     3 221840.
  2  2016     4 288589.
  3  2016     5 255164.
  4  2016     6 205371.
  5  2016     7 167252.
  6  2016     8 140465.
  7  2016     9  97779.
  8  2016    10 135482.
  9  2016    11 166808.
 10  2016    12 165787.

I had missed the beginning of the command, where the resulting data frame
needs to be named first.
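
To illustrate the point about the omitted first argument, here is a minimal
sketch comparing the piped form with the equivalent nested call. The small
`disc` data frame below is a hypothetical stand-in (the real data are not
shown in this thread), and `.groups = "drop_last"` is written out only to
make explicit what appears to be the default grouping behaviour seen in the
output above.

```r
library(dplyr)

# Hypothetical stand-in for the real `disc` data frame (assumption).
disc <- data.frame(
  year  = c(2016L, 2016L, 2016L, 2016L, 2017L),
  month = c(3L, 3L, 4L, 4L, 3L),
  cfs   = c(100, 200, NA, 400, 300)
)

# Piped form: each step receives the previous result as its first
# argument (.data), so it is not written out in the call.
piped <- disc %>%
  group_by(year, month) %>%
  summarize(vol = mean(cfs, na.rm = TRUE), .groups = "drop_last")

# Nested form: the same computation with the first argument supplied
# explicitly at each step.
nested <- summarize(
  group_by(disc, year, month),
  vol = mean(cfs, na.rm = TRUE),
  .groups = "drop_last"
)

# Compare the two results.
print(all.equal(as.data.frame(piped), as.data.frame(nested)))
```

In both forms the result is assigned at the start of the command; the pipe
only changes how the data frame reaches group_by() and summarize().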

This clarifies my understanding and I appreciate your and Eric's help.

Regards,

Rich