Hello,

I cannot reproduce this error with a built-in data set.
Can you post str(my_tbl)?

suppressPackageStartupMessages(library(dplyr))

mtcars %>%
  mutate(hp = round(hp * 2) / 2) %>%
  group_by(cyl, hp) %>%
  summarise(
    count = n(),
    hp = mean(hp),
    stdev = sd(hp)
  )
#> `summarise()` has grouped output by 'cyl'. You can override using the `.groups`
#> argument.
#> # A tibble: 23 x 4
#> # Groups:   cyl [3]
#>      cyl    hp count stdev
#>    <dbl> <dbl> <int> <dbl>
#>  1     4    52     1    NA
#>  2     4    62     1    NA
#>  3     4    65     1    NA
#>  4     4    66     2    NA
#>  5     4    91     1    NA
#>  6     4    93     1    NA
#>  7     4    95     1    NA
#>  8     4    97     1    NA
#>  9     4   109     1    NA
#> 10     4   113     1    NA
#> # ... with 13 more rows

Hope this helps,

Rui Barradas

At 14:14 on 11/03/2022, Jeff Reichman wrote:
> r-help forum
>
> When I run the following code
>
> my_tbl %>%
>   mutate(Bse_bwt = round(Bse_bwt * 2) / 2) %>%
>   group_by(Cat, Bse_bwt) %>%
>   summarize(count = n(), Bse_ftv = mean(Bse_ftv), stdev = sd(Bse_ftv))
>
> I get the following error:
>
> Error: `stdev` refers to a variable created earlier in this summarise().
> Do you need an extra mutate() step?
>
> I suspect it is because the standard deviation of a length-one vector is NA
> and R errors out on the standard deviation of 1. So then I tried
>
> summarize(count = n(), Bse_ftv = mean(Bse_ftv),
>           stdev = if(n()>1) sd(Bse_ftv) else 0)
>
> and this didn't seem to work either. So there has to be a way to add some
> sort of error checker to my standard deviation function to check if n > 1
> and then take the standard deviation in dplyr.
>
> Jeff
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Rui

I don't have the data with me. But I did try using the mtcars dataset you used

mtcars %>%
  mutate(hp = round(hp * 2) / 2) %>%
  group_by(gear, carb) %>%
  summarise(count = n(), mean_hp = mean(hp), stdev_hp = sd(hp))

which resulted in

# A tibble: 11 x 5
# Groups:   gear [3]
    gear  carb count mean_hp stdev_hp
   <dbl> <dbl> <int>   <dbl>    <dbl>
 1     3     1     3   104       6.56
 2     3     2     4   162.     14.4
 3     3     3     3   180       0
 4     3     4     5   228      17.9
 5     4     1     4    72.5    13.7
 6     4     2     4    79.5    26.9
 7     4     4     4   116.      7.51
 8     5     2     2   102      15.6
 9     5     4     1   264      NA
10     5     6     1   175      NA
11     5     8     1   335      NA

So maybe there is something odd with my dataset, because the mtcars code ran
just fine. Where count == 1, sd returned NA, which is what I was expecting
originally.

-----Original Message-----
From: Rui Barradas <ruipbarradas at sapo.pt>
Sent: Friday, March 11, 2022 9:24 AM
To: reichmanj at sbcglobal.net; r-help at r-project.org
Subject: Re: [R] stdev error
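On the suspicion in the original post that sd() fails for a single observation: in base R, sd() of a length-one vector returns NA rather than raising an error, which matches the NA rows in the mtcars output. If 0 is preferred over NA for singleton groups, a small guard can be wrapped around sd(). This is only a minimal sketch; the helper name safe_sd and its `single` argument are made up for illustration and are not part of base R or dplyr.

```r
# sd() of a single value is NA, not an error
sd(5)

# Hypothetical helper (not from any package): return `single` when there are
# fewer than two observations, otherwise the usual sample standard deviation.
safe_sd <- function(x, single = 0) {
  if (length(x) > 1) sd(x) else single
}

safe_sd(5)           # one value, so the fallback is returned
safe_sd(c(1, 2, 3))  # same as sd(c(1, 2, 3))
```

Inside summarise() this could be called as stdev = safe_sd(Bse_ftv), provided Bse_ftv has not already been overwritten earlier in the same call.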
Rui

Found my problem, or at least I think I found the problem.

# BEWARE: reusing variables may lead to unexpected results
- https://dplyr.tidyverse.org/reference/summarise.html

I changed my variable name and the problem was resolved.

Jeff
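For the archive, a short sketch of what the reuse warning means, using mtcars as in Rui's example. Within one summarise() call, later expressions see the columns created by earlier ones, so after hp = mean(hp) the name hp refers to a single number per group and sd(hp) no longer sees the original values. That would explain the NA for the group with count 2 in Rui's output. The output names mean_hp and stdev_hp and the object names below are arbitrary choices for illustration.

```r
library(dplyr)

# Option 1: give the summaries new names so the input column hp is never
# overwritten inside the call.
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(
    count    = n(),
    mean_hp  = mean(hp),
    stdev_hp = sd(hp),
    .groups  = "drop"
  )

# Option 2: if the original name must be kept, compute every summary that
# needs the raw values before the name is reassigned.
by_cyl_same_name <- mtcars %>%
  group_by(cyl) %>%
  summarise(
    count   = n(),
    stdev   = sd(hp),
    hp      = mean(hp),
    .groups = "drop"
  )

by_cyl
```

Option 1 is what Jeff describes doing. Option 2 relies on the order of the arguments, so it is easier to break by accident when the call is edited later.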