Afshartous, David
2007-Jul-05 16:17 UTC
[R] summarizing dataframe at variable/factor levels
All,

Is there an efficient way to apply, say, "mean" or "median" to a dataframe according to all combinations of two variables in the dataframe? Below is a simple example and the outline of a "manual" solution that works but is not very efficient (it could also be generalized into a function). I searched the archives and docs but didn't see anything close to this question.

Cheers,
dave

dat.ex = data.frame( rep(c(1:6), each=6),
                     c(rnorm(12), rnorm(12, 1), rnorm(12, 2)),
                     rnorm(36, 5),
                     rep(c(1:6), 6),
                     rep(c("Drug1", "Drug2", "Placebo"), each=12) )
names(dat.ex) = c("patient.no", "outcome", "x", "time", "drug")

## mean of first 2 time pts on Drug1
## (colMeans() gives one mean per column; mean() on a data frame returns NA in current R)
mean.time.1.drug.1 = colMeans( dat.ex[dat.ex$time==1 & dat.ex$drug=="Drug1", c(2,3)] )
mean.time.2.drug.1 = colMeans( dat.ex[dat.ex$time==2 & dat.ex$drug=="Drug1", c(2,3)] )

dat.ex.reduced = as.data.frame(rbind(mean.time.1.drug.1, mean.time.2.drug.1))
## add back Drug variable and time variable
dat.ex.reduced$Drug = c("Drug1", "Drug1")
dat.ex.reduced$time = c(1,2)
## use '=' (not '<-') inside data.frame() so the columns are named trts, doses, resp
my.data <- data.frame(
  trts  = rep(c('Drug1', 'Drug2'), each = 10),
  doses = rep(c('Low dose', 'High dose'), 10),
  resp  = rnorm(20)
)
tapply(my.data$resp, list(my.data$trts, my.data$doses), mean)

Jim
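As an editorial illustration, the tapply() idea above can be applied to the poster's own dat.ex. This is only a sketch: the seed, the named columns, and the object names outcome.means and outcome.long are assumptions made here for reproducibility, not part of the original posts. Note that tapply() summarizes one response at a time and returns a matrix, so a second step is needed to get back to a data frame.

```r
set.seed(1)  # arbitrary seed (assumption), only so the sketch is reproducible

# Same layout as the poster's dat.ex, with the columns named directly
dat.ex <- data.frame(
  patient.no = rep(1:6, each = 6),
  outcome    = c(rnorm(12), rnorm(12, 1), rnorm(12, 2)),
  x          = rnorm(36, 5),
  time       = rep(1:6, 6),
  drug       = rep(c("Drug1", "Drug2", "Placebo"), each = 12)
)

# Matrix of mean outcome: rows are time points, columns are drugs
outcome.means <- with(dat.ex,
                      tapply(outcome, list(time = time, drug = drug), mean))

# Convert the matrix to long form, one row per time/drug combination
outcome.long <- as.data.frame(as.table(outcome.means),
                              responseName = "mean.outcome")
```

Swapping mean for median in the tapply() call gives the medians instead; for several response columns at once, aggregate() is usually more convenient.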
Afshartous, David <afshart <at> exchange.sba.miami.edu> writes:

> All,
> Is there an efficient way to apply say "mean" or "median" to a dataframe
> according to say all combinations of two variables in the dataframe?
> ..[snip]..

See the function summaryBy in the package doBy.
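For readers who want to try this suggestion, here is a minimal sketch of a summaryBy() call on the poster's data. It assumes the doBy package is installed (it is not part of base R); the seed and the object name by.doBy are choices made here for illustration only.

```r
set.seed(1)  # arbitrary seed (assumption), only so the sketch is reproducible

# Same layout as the poster's dat.ex, with the columns named directly
dat.ex <- data.frame(
  patient.no = rep(1:6, each = 6),
  outcome    = c(rnorm(12), rnorm(12, 1), rnorm(12, 2)),
  x          = rnorm(36, 5),
  time       = rep(1:6, 6),
  drug       = rep(c("Drug1", "Drug2", "Placebo"), each = 12)
)

if (requireNamespace("doBy", quietly = TRUE)) {
  # Responses on the left of '~', grouping variables on the right
  by.doBy <- doBy::summaryBy(outcome + x ~ time + drug,
                             data = dat.ex, FUN = mean)
  print(head(by.doBy))
} else {
  message("Package 'doBy' is not installed; skipping the summaryBy() sketch.")
}
```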
Gabor Grothendieck
2007-Jul-05 16:47 UTC
[R] summarizing dataframe at variable/factor levels
Try this:

aggregate(dat.ex[2:3], dat.ex[4:5], mean)
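The aggregate() call above picks columns by position (2:3 are outcome and x, 4:5 are time and drug). As a hedged sketch, the same summary can be written with the formula interface of aggregate(), which refers to columns by name; the seed and the object names by.position, by.name and med.by.name are assumptions introduced here, not part of the original reply.

```r
set.seed(1)  # arbitrary seed (assumption), only so the sketch is reproducible

# Same layout as the poster's dat.ex, with the columns named directly
dat.ex <- data.frame(
  patient.no = rep(1:6, each = 6),
  outcome    = c(rnorm(12), rnorm(12, 1), rnorm(12, 2)),
  x          = rnorm(36, 5),
  time       = rep(1:6, 6),
  drug       = rep(c("Drug1", "Drug2", "Placebo"), each = 12)
)

# The call from the reply: responses and grouping columns chosen by position
by.position <- aggregate(dat.ex[2:3], dat.ex[4:5], mean)

# Formula interface: responses on the left, grouping variables on the right
by.name <- aggregate(cbind(outcome, x) ~ time + drug, data = dat.ex, FUN = mean)

# Any summary function can be used in place of mean, for example median
med.by.name <- aggregate(cbind(outcome, x) ~ time + drug, data = dat.ex,
                         FUN = median)
```

Both forms return one row per time/drug combination, with the grouping columns first and the summarized columns after them.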