Hi, I want to run something like SELECT firm_id, count(*), mean(value), sd(value) FROM table GROUP BY firm_id; But I have to write a for loop like for ( id in unique(table$firm_id ) { print(paste( id, mean(table[firm_id == id, "value"]) )) } Is there any way to do it easier? Thanks :) Best, Reeyarn Lee [[alternative HTML version deleted]]
This is a very commonly used paradigm in R and there are many different answers, one would be tapply(). You could also look at the plyr package. Michael On Sat, Mar 24, 2012 at 11:32 AM, reeyarn <reeyarn at gmail.com> wrote:> Hi, I want to run something like > ?SELECT firm_id, count(*), mean(value), sd(value) > ?FROM table > ?GROUP BY firm_id; > > But I have to write a for loop like > ?for ( id in unique(table$firm_id ) { > ? ?print(paste( ?id, mean(table[firm_id == id, "value"]) ?)) > ?} > > Is there any way to do it easier? Thanks :) > > > Best, > Reeyarn Lee > > ? ? ? ?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.
Hello, Reeyarn_???_10928113 wrote> > Hi, I want to run something like > SELECT firm_id, count(*), mean(value), sd(value) > FROM table > GROUP BY firm_id; > > But I have to write a for loop like > for ( id in unique(table$firm_id ) { > print(paste( id, mean(table[firm_id == id, "value"]) )) > } > > Is there any way to do it easier? Thanks :) > > > Best, > Reeyarn Lee > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help@ mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >You can use sql with package 'sqldf'. library(sqldf) # It needs package 'tcltk' ?sqldf tbl <- data.frame(firm_id=rep(c(1,2), 10), value=rnorm(20)) sqldf("SELECT firm_id, count(*), avg(value), stdev(value) FROM tbl GROUP BY firm_id;") I've changed the name of your data.frame because 'table' is an R function. Hope this helps, Rui Barradas -- View this message in context: http://r.789695.n4.nabble.com/How-to-compute-within-group-mean-and-sd-tp4501504p4501709.html Sent from the R help mailing list archive at Nabble.com.
In addition to Michael's answers, there are packages that allow you to use SQL syntax on R data objects, so you could probably just use what you are familiar with. On Sat, Mar 24, 2012 at 9:32 AM, reeyarn <reeyarn at gmail.com> wrote:> Hi, I want to run something like > ?SELECT firm_id, count(*), mean(value), sd(value) > ?FROM table > ?GROUP BY firm_id; > > But I have to write a for loop like > ?for ( id in unique(table$firm_id ) { > ? ?print(paste( ?id, mean(table[firm_id == id, "value"]) ?)) > ?} > > Is there any way to do it easier? Thanks :) > > > Best, > Reeyarn Lee > > ? ? ? ?[[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.-- Gregory (Greg) L. Snow Ph.D. 538280 at gmail.com