thr3ads.net - R help - [R] tidyverse: grouped summaries (with summerize) [Sep 2021]

Rich Shepard

2021-Sep-13 21:50 UTC

[R] tidyverse: grouped summaries (with summerize)

On Mon, 13 Sep 2021, Rich Shepard wrote:
> That's what I thought I did. I'll rewrite the script and work toward the
> output I need.
Still not the correct syntax. Command is now:
disc_by_month %>%
     group_by(year, month) %>%
     summarize(disc_by_month, vol = mean(cfs, na.rm = TRUE))

and results are:

> source('disc.R')
`summarise()` has grouped output by 'year', 'month'. You can
override using the `.groups` argument.
> disc_by_month
# A tibble: 590,940 × 6
# Groups:   year, month [66]
     year month   day  hour   min    cfs
    <int> <int> <int> <int> <int>  <dbl>
  1  2016     3     3    12     0 149000
  2  2016     3     3    12    10 150000
  3  2016     3     3    12    20 151000
  4  2016     3     3    12    30 156000
  5  2016     3     3    12    40 154000
  6  2016     3     3    12    50 150000
  7  2016     3     3    13     0 153000
  8  2016     3     3    13    10 156000
  9  2016     3     3    13    20 154000
10  2016     3     3    13    30 155000
# … with 590,930 more rows

The grouping is still not right. I expected to see a mean value for each
month of each year in the data set, not for each minute.

Rich

Eric Berger

2021-Sep-13 21:56 UTC

This code is not correct:
disc_by_month %>%
     group_by(year, month) %>%
     summarize(disc_by_month, vol = mean(cfs, na.rm = TRUE))

It should be:

disc %>% group_by(year, month) %>% summarize(vol = mean(cfs, na.rm = TRUE))
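
A minimal sketch of that corrected pipeline, assuming a small made-up tibble in place of the real discharge data (the values below and the result name `monthly` are hypothetical, not from the thread; `.groups = "drop"` is added only to silence the grouping message):

```r
library(dplyr)

# Hypothetical toy data standing in for the real discharge readings
disc <- tibble(
  year  = c(2016L, 2016L, 2016L, 2016L, 2017L, 2017L),
  month = c(3L, 3L, 4L, 4L, 3L, 3L),
  cfs   = c(149000, 151000, 100000, NA, 120000, 130000)
)

# The pipe supplies the data; summarize() receives only the summary expression
monthly <- disc %>%
  group_by(year, month) %>%
  summarize(vol = mean(cfs, na.rm = TRUE), .groups = "drop")

print(monthly)
```

With this toy input the result has one row per year/month combination (three rows), rather than one row per reading.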






Avi Gross

2021-Sep-13 22:36 UTC

As Eric has pointed out, perhaps Rich is not thinking in terms of pipelines.
summarize() takes the data as its first argument:
	summarise(.data = whatever, ...)

But in a pipeline, you OMIT the first argument and let the pipe supply it
silently.

What I think summarize saw was something like:

summarize(., disc_by_month, vol = mean(cfs, na.rm = TRUE))

There is now a superfluous SECOND argument: a data.frame type of variable in
a place where summarize expected the name of a column in the hidden
data.frame-like object it was passed. You do not have a column called
disc_by_month, and presumably some weird logic made it substitute the first
column or something.

I hope this makes sense. You cannot cobble a pipeline together from parts
without carefully making sure that the first arguments those functions would
otherwise take are NOT supplied again.
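
To illustrate the point about the omitted first argument, here is a small sketch, assuming made-up data (the tibble values and the names `piped` and `direct` are hypothetical). It compares a piped call with the equivalent direct call in which the data is passed explicitly as the first argument:

```r
library(dplyr)

# Hypothetical toy data; only the column names follow the thread
disc <- tibble(
  year  = c(2016L, 2016L, 2017L),
  month = c(3L, 3L, 3L),
  cfs   = c(10, 20, 30)
)

# Piped form: the data frame is NOT repeated inside summarize()
piped <- disc %>% summarize(vol = mean(cfs))

# Direct form: the same call with the data written out as the first argument
direct <- summarize(disc, vol = mean(cfs))

# Both forms compute the same summary value
print(identical(piped$vol, direct$vol))
```

The pipe simply rewrites `x %>% f(y)` as `f(x, y)`, so naming the data frame again inside the piped call hands the function an extra argument it did not expect.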

And, just FYI, the subject line should not use a word that some see as the
opposite companion of "winterize" ...

