I sent Val a longer reply, but for anyone else here, please note that Val was copying my OUTPUT, in which continuation lines start with a plus sign, so of course that has issues!

The naked code to use was this:

dat %>%
  group_by(Year, Sex) %>%
  summarize( M = mean(wt, na.rm=TRUE)) %>%
  pivot_wider(names_from = Sex, values_from = M) %>%
  as.data.frame %>%
  round(1)

-----Original Message-----
From: Val <valkremk at gmail.com>
Sent: Monday, November 1, 2021 8:15 PM
To: Avi Gross <avigross at verizon.net>
Cc: r-help mailing list <r-help at r-project.org>
Subject: Re: [R] by group

Thank you Avi,

One question, I am getting this error from this script:

> dat %>%
+ +   group_by(Year, Sex) %>%
+ +   summarize( M = mean(wt, na.rm=TRUE)) %>%
+ +   pivot_wider(names_from = Sex, values_from = M) %>%
+ +   as.data.frame %>%
+ +   round(1)
Error in group_by(Year, Sex) : object 'Year' not found

Why am I getting this?

On Mon, Nov 1, 2021 at 7:07 PM Avi Gross via R-help <r-help at r-project.org> wrote:
>
> Understood Val. So you need to save the output in something like a data.frame which can then be saved as a CSV file or whatever else makes sense to be read in by a later program. As noted, by() does not produce the output in a usable way.
>
> But you mentioned efficient, and that is another whole ball of wax. For small amounts of data it may not matter much. And some processes may look slower but turn out to be more efficient if compiled as C/C++ or ...
>
> Sometimes it might be more efficient to change the format of your data before the analysis, albeit if the output is much smaller, maybe best later.
>
> Good luck.
>
> -----Original Message-----
> From: Val <valkremk at gmail.com>
> Sent: Monday, November 1, 2021 7:44 PM
> To: Avi Gross <avigross at verizon.net>
> Cc: r-help mailing list <r-help at r-project.org>
> Subject: Re: [R] by group
>
> Thank you all!
> I can assure you that this is not HW.
> This is a sample of my large data set and I want a simple and efficient approach to get the
> desired output in that particular format. That file will be saved
> and used as an input file for another external process.
>
> val
>
> On Mon, Nov 1, 2021 at 6:08 PM Avi Gross via R-help <r-help at r-project.org> wrote:
> >
> > Jim,
> >
> > Your code gives the output in quite a different format and as an
> > object of class "by" that is not easily convertible to a data.frame.
> > So, yes, it is an answer that produces the right numbers but not in
> > the places or data structures I think they (or if it is HW ...) wanted.
> >
> > Trivial standard cases are often handled by a single step but more
> > complex ones often suggest a multi-part approach.
> >
> > Of course Val gets to decide what approach works best for them
> > within whatever constraints we here are not made aware of. If this
> > is a class assignment, it likely would be using only tools discussed
> > in the class. So I would not suggest using a dplyr/tidyverse
> > approach if that is not covered or even part of a class. If this is
> > a project in the real world, it becomes a matter of programming taste and convenience and so on.
> >
> > Maybe Val can share more about the situation so we can see what is
> > helpful and what is not. Realistically, I can think of way too many
> > ways to get the required output.
> >
> > -----Original Message-----
> > From: R-help <r-help-bounces at r-project.org> On Behalf Of Jim Lemon
> > Sent: Monday, November 1, 2021 6:25 PM
> > To: Val <valkremk at gmail.com>; r-help mailing list
> > <r-help at r-project.org>
> > Subject: Re: [R] by group
> >
> > Hi Val,
> > I think you answered your own question:
> >
> > by(dat$wt,dat[,c("Sex","Year")],mean)
> >
> > Jim
> >
> > On Tue, Nov 2, 2021 at 8:09 AM Val <valkremk at gmail.com> wrote:
> > >
> > > Hi All,
> > >
> > > How can I generate the mean by group?
> > > The sample data looks as follows:
> > >
> > > dat<-read.table(text="Year Sex wt
> > > 2001 M 15
> > > 2001 M 14
> > > 2001 M 16
> > > 2001 F 12
> > > 2001 F 11
> > > 2001 F 13
> > > 2002 M 14
> > > 2002 M 18
> > > 2002 M 17
> > > 2002 F 11
> > > 2002 F 15
> > > 2002 F 14
> > > 2003 M 18
> > > 2003 M 13
> > > 2003 M 14
> > > 2003 F 15
> > > 2003 F 10
> > > 2003 F 11 ",header=TRUE)
> > >
> > > The desired output is,
> > >         M      F
> > > 2001    15     12
> > > 2002    16.33  13.33
> > > 2003    15     12
> > >
> > > Thank you,
> > >
> > > ______________________________________________
> > > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
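
For anyone else who pastes from console output: below is a minimal sketch of the same pipeline as a standalone script, assuming the dplyr and tidyr packages are installed. The library() calls, the `.groups = "drop"` argument, and the name `res` are additions for illustration and were not in the posted code. As far as I can tell, the pasted `+` continuation characters are parsed as unary plus, so group_by() ends up being called without the piped data, which would explain the "object 'Year' not found" error.

```r
# Sketch only: assumes dplyr (>= 1.0.0) and tidyr are installed.
library(dplyr)
library(tidyr)

# Sample data as posted in the original question.
dat <- read.table(text = "Year Sex wt
2001 M 15
2001 M 14
2001 M 16
2001 F 12
2001 F 11
2001 F 13
2002 M 14
2002 M 18
2002 M 17
2002 F 11
2002 F 15
2002 F 14
2003 M 18
2003 M 13
2003 M 14
2003 F 15
2003 F 10
2003 F 11", header = TRUE)

# Mean weight per Year/Sex, spread into one column per Sex.
# No leading "+" characters: those belong to the console prompt, not the code.
res <- dat %>%
  group_by(Year, Sex) %>%
  summarize(M = mean(wt, na.rm = TRUE), .groups = "drop") %>%
  pivot_wider(names_from = Sex, values_from = M) %>%
  as.data.frame() %>%
  round(1)

print(res)
```

The result is an ordinary data.frame with columns Year, F and M, so it can be passed to write.csv() if a file is needed for a later program.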
Dear Val,

also consider using reshape2::dcast

dat <- structure(list(Year = c(2001L, 2001L, 2001L, 2001L, 2001L,
    2001L, 2002L, 2002L, 2002L, 2002L, 2002L, 2002L, 2003L, 2003L,
    2003L, 2003L, 2003L, 2003L),
    Sex = c("M", "M", "M", "F", "F", "F", "M", "M", "M", "F", "F",
    "F", "M", "M", "M", "F", "F", "F"),
    wt = c(15L, 14L, 16L, 12L, 11L, 13L, 14L, 18L, 17L, 11L, 15L,
    14L, 18L, 13L, 14L, 15L, 10L, 11L)),
    class = "data.frame", row.names = c(NA, -18L))

reshape2::dcast(data=dat, formula=Year~Sex, value.var="wt", fun.aggregate=mean)

yielding

  Year        F        M
1 2001 12.00000 15.00000
2 2002 13.33333 16.33333
3 2003 12.00000 15.00000

Best,
Rasmus
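
If neither reshape2 nor the tidyverse is available, a base R sketch along the same lines is possible with tapply(), which returns a Year-by-Sex matrix directly. This is only one of the many possible approaches mentioned in the thread, not something any poster proposed. The names `means`, `out` and `out_file`, the rounding to two decimals, and the CSV file name are all hypothetical choices for illustration.

```r
# Sample data as posted in the original question.
dat <- read.table(text = "Year Sex wt
2001 M 15
2001 M 14
2001 M 16
2001 F 12
2001 F 11
2001 F 13
2002 M 14
2002 M 18
2002 M 17
2002 F 11
2002 F 15
2002 F 14
2003 M 18
2003 M 13
2003 M 14
2003 F 15
2003 F 10
2003 F 11", header = TRUE)

# tapply() with two grouping columns gives a matrix:
# one row per Year, one column per Sex.
means <- tapply(dat$wt, dat[c("Year", "Sex")], mean, na.rm = TRUE)

# Reorder the columns to match the requested layout (M first, then F).
means <- means[, c("M", "F")]

# Convert to a data.frame with Year as a regular column.
out <- data.frame(Year = as.integer(rownames(means)),
                  round(means, 2),
                  row.names = NULL)
print(out)

# Hypothetical output location; replace with the path the later program expects.
out_file <- file.path(tempdir(), "means_by_year_sex.csv")
write.csv(out, out_file, row.names = FALSE)
```

Unlike the object of class "by" discussed earlier in the thread, `out` here is a plain data.frame, so it can be written to a file without further conversion.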