I have a data frame called test shown below that i would like to summarize in a particular way : I want to show the column sums (columns y ,f) grouped by country (column e1). However, I'm looking for the data to be split according to column e2. In other words, two tables of sum by country. One table for "con" and one table for "std" shown in column e2. Finally at the bottom of the two tables, I would like the overall sum /Totals for all the countries for the two columns (y,f). The lay outs for the two tables I'm looking for are also shown below in case my description isn't completely clear I would also like to be able to use the Totals of y and f for the two tables in other calculations. I can get the two sets of totals with the following commands but not the sums by country. colSums(test[test$e2=="std", c(3,4)]) colSums(test[test$e2=="con", c(3,4)]) I know there's an easy way to do this with a combination of colSums, by, aggregate but I can't seem to get it. std y f usa sum sum france sum sum can sum sum italy sum sum Totals sum sum con y f usa sum sum france sum sum can sum sum italy sum sum Totals sum sum e1 e2 y f 1 usa std 1 1 2 usa std 1 2 3 can con 1 3 4 france con 1 4 5 can std 1 5 6 italy con 1 6 7 usa std 2 7 8 usa std 2 8 9 can con 2 9 10 france con 2 10 11 can std 2 11 12 italy con 2 12 13 usa std 3 13 14 usa std 3 14 15 can con 3 15 16 france con 3 16 17 can std 3 17 18 italy con 3 18 19 usa std 4 19 20 usa std 4 20 21 can con 4 21 22 france con 4 22 23 can std 4 23 24 italy con 4 24 25 usa std 5 25 26 usa std 5 26 27 can con 5 27 28 france con 5 28 29 can std 5 29 30 italy con 5 30 31 usa std 6 31 32 usa std 6 32 33 can con 6 33 34 france con 6 34 35 can std 6 35 36 italy con 6 36 -- View this message in context: http://r.789695.n4.nabble.com/Not-sure-how-to-use-aggregate-colSums-by-tp3743258p3743258.html Sent from the R help mailing list archive at Nabble.com.
I hope this will help you get going b <- sapply(unique(test$e2), function(x) { out <- aggregate(cbind(y,f)~e1, subset(test, e2==x),"sum") out <- rbind(out, data.frame(e1="total", y=sum(out$y), f=sum(out$f))) out <- list(out) names(out) <- x out })> b$std e1 y f 1 can 21 120 2 usa 42 198 3 total 63 318 $con e1 y f 1 can 21 108 2 france 21 114 3 italy 21 126 4 total 63 348 On 08/14/2011 12:20 PM, eric wrote:> I have a data frame called test shown below that i would like to summarize in > a particular way : > > I want to show the column sums (columns y ,f) grouped by country (column > e1). However, I'm looking for the data to be split according to column e2. > In other words, two tables of sum by country. One table for "con" and one > table for "std" shown in column e2. Finally at the bottom of the two tables, > I would like the overall sum /Totals for all the countries for the two > columns (y,f). The lay outs for the two tables I'm looking for are also > shown below in case my description isn't completely clear > > I would also like to be able to use the Totals of y and f for the two tables > in other calculations. > > I can get the two sets of totals with the following commands but not the > sums by country. > > colSums(test[test$e2=="std", c(3,4)]) > colSums(test[test$e2=="con", c(3,4)]) > > I know there's an easy way to do this with a combination of colSums, by, > aggregate but I can't seem to get it. > > std y f > > usa sum sum > france sum sum > can sum sum > italy sum sum > Totals sum sum > > con y f > > usa sum sum > france sum sum > can sum sum > italy sum sum > Totals sum sum > > e1 e2 y f > 1 usa std 1 1 > 2 usa std 1 2 > 3 can con 1 3 > 4 france con 1 4 > 5 can std 1 5 > 6 italy con 1 6 > 7 usa std 2 7 > 8 usa std 2 8 > 9 can con 2 9 > 10 france con 2 10 > 11 can std 2 11 > 12 italy con 2 12 > 13 usa std 3 13 > 14 usa std 3 14 > 15 can con 3 15 > 16 france con 3 16 > 17 can std 3 17 > 18 italy con 3 18 > 19 usa std 4 19 > 20 usa std 4 20 > 21 can con 4 21 > 22 france con 4 22 > 23 can std 4 23 > 24 italy con 4 24 > 25 usa std 5 25 > 26 usa std 5 26 > 27 can con 5 27 > 28 france con 5 28 > 29 can std 5 29 > 30 italy con 5 30 > 31 usa std 6 31 > 32 usa std 6 32 > 33 can con 6 33 > 34 france con 6 34 > 35 can std 6 35 > 36 italy con 6 36 > > -- > View this message in context: http://r.789695.n4.nabble.com/Not-sure-how-to-use-aggregate-colSums-by-tp3743258p3743258.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.
Hi eric, Try lapply(with(x, split(x, e2)), function(l){ r <- with(l, aggregate(list(y, f), list(e1), sum)) colnames(r) <- c('e1', 'y', 'f') r }) HTH, Jorge On Sun, Aug 14, 2011 at 1:20 PM, eric <> wrote:> I have a data frame called test shown below that i would like to summarize > in > a particular way : > > I want to show the column sums (columns y ,f) grouped by country (column > e1). However, I'm looking for the data to be split according to column e2. > In other words, two tables of sum by country. One table for "con" and one > table for "std" shown in column e2. Finally at the bottom of the two > tables, > I would like the overall sum /Totals for all the countries for the two > columns (y,f). The lay outs for the two tables I'm looking for are also > shown below in case my description isn't completely clear > > I would also like to be able to use the Totals of y and f for the two > tables > in other calculations. > > I can get the two sets of totals with the following commands but not the > sums by country. > > colSums(test[test$e2=="std", c(3,4)]) > colSums(test[test$e2=="con", c(3,4)]) > > I know there's an easy way to do this with a combination of colSums, by, > aggregate but I can't seem to get it. > > std y f > > usa sum sum > france sum sum > can sum sum > italy sum sum > Totals sum sum > > con y f > > usa sum sum > france sum sum > can sum sum > italy sum sum > Totals sum sum > > e1 e2 y f > 1 usa std 1 1 > 2 usa std 1 2 > 3 can con 1 3 > 4 france con 1 4 > 5 can std 1 5 > 6 italy con 1 6 > 7 usa std 2 7 > 8 usa std 2 8 > 9 can con 2 9 > 10 france con 2 10 > 11 can std 2 11 > 12 italy con 2 12 > 13 usa std 3 13 > 14 usa std 3 14 > 15 can con 3 15 > 16 france con 3 16 > 17 can std 3 17 > 18 italy con 3 18 > 19 usa std 4 19 > 20 usa std 4 20 > 21 can con 4 21 > 22 france con 4 22 > 23 can std 4 23 > 24 italy con 4 24 > 25 usa std 5 25 > 26 usa std 5 26 > 27 can con 5 27 > 28 france con 5 28 > 29 can std 5 29 > 30 italy con 5 30 > 31 usa std 6 31 > 32 usa std 6 32 > 33 can con 6 33 > 34 france con 6 34 > 35 can std 6 35 > 36 italy con 6 36 > > -- > View this message in context: > http://r.789695.n4.nabble.com/Not-sure-how-to-use-aggregate-colSums-by-tp3743258p3743258.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >[[alternative HTML version deleted]]
Hi: Here's another approach using the reshape2 package. I called your data frame dat in the code below. library('reshape2') mdat <- melt(dat, measure = c('y', 'f')) acast(mdat, e1 ~ variable ~ e2, fun = sum, margins = 'e1') , , con y f can 21 108 france 21 114 italy 21 126 usa 0 0 (all) 63 348 , , std y f can 21 120 france 0 0 italy 0 0 usa 42 198 (all) 63 318 This returns an array rather than a list. HTH, Dennis On Sun, Aug 14, 2011 at 10:20 AM, eric <ericstrom at aol.com> wrote:> I have a data frame called test shown below that i would like to summarize in > a particular way : > > I want to show the column sums (columns y ,f) grouped by country (column > e1). However, I'm looking for the data to be split according to column e2. > In other words, two tables of sum by country. One table for "con" and one > table for "std" shown in column e2. Finally at the bottom of the two tables, > I would like the overall sum /Totals for all the countries for the two > columns (y,f). ?The lay outs for the two tables I'm looking for are also > shown below in case my description isn't completely clear > > I would also like to be able to use the Totals of y and f for the two tables > in other calculations. > > I can get the two sets of totals with the following commands but not the > sums by country. > > colSums(test[test$e2=="std", c(3,4)]) > colSums(test[test$e2=="con", c(3,4)]) > > I know there's an easy way to do this with a combination of colSums, by, > aggregate but I can't seem to get it. > > std ? ? ? ? y ? ? ? f > > usa ? ? ? ?sum ? sum > france ? ?sum ? sum > can ? ? ? ?sum ? sum > italy ? ? ? sum ? sum > Totals ? ?sum ? sum > > con ? ? ? y ? ? ? f > > usa ? ? ? sum ? sum > france ? sum ? sum > can ? ? ? sum ? sum > italy ? ? ?sum ? sum > Totals ? ?sum ? sum > > ? ? ?e1 ?e2 y ?f > 1 ? ? usa std 1 ?1 > 2 ? ? usa std 1 ?2 > 3 ? ? can con 1 ?3 > 4 ?france con 1 ?4 > 5 ? ? can std 1 ?5 > 6 ? italy con 1 ?6 > 7 ? ? usa std 2 ?7 > 8 ? ? usa std 2 ?8 > 9 ? ? can con 2 ?9 > 10 france con 2 10 > 11 ? ?can std 2 11 > 12 ?italy con 2 12 > 13 ? ?usa std 3 13 > 14 ? ?usa std 3 14 > 15 ? ?can con 3 15 > 16 france con 3 16 > 17 ? ?can std 3 17 > 18 ?italy con 3 18 > 19 ? ?usa std 4 19 > 20 ? ?usa std 4 20 > 21 ? ?can con 4 21 > 22 france con 4 22 > 23 ? ?can std 4 23 > 24 ?italy con 4 24 > 25 ? ?usa std 5 25 > 26 ? ?usa std 5 26 > 27 ? ?can con 5 27 > 28 france con 5 28 > 29 ? ?can std 5 29 > 30 ?italy con 5 30 > 31 ? ?usa std 6 31 > 32 ? ?usa std 6 32 > 33 ? ?can con 6 33 > 34 france con 6 34 > 35 ? ?can std 6 35 > 36 ?italy con 6 36 > > -- > View this message in context: http://r.789695.n4.nabble.com/Not-sure-how-to-use-aggregate-colSums-by-tp3743258p3743258.html > Sent from the R help mailing list archive at Nabble.com. > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >