I have a data frame called test shown below that i would like to summarize in
a particular way :
I want to show the column sums (columns y ,f) grouped by country (column
e1). However, I'm looking for the data to be split according to column e2.
In other words, two tables of sum by country. One table for "con" and
one
table for "std" shown in column e2. Finally at the bottom of the two
tables,
I would like the overall sum /Totals for all the countries for the two
columns (y,f). The lay outs for the two tables I'm looking for are also
shown below in case my description isn't completely clear
I would also like to be able to use the Totals of y and f for the two tables
in other calculations.
I can get the two sets of totals with the following commands but not the
sums by country.
colSums(test[test$e2=="std", c(3,4)])
colSums(test[test$e2=="con", c(3,4)])
I know there's an easy way to do this with a combination of colSums, by,
aggregate but I can't seem to get it.
std y f
usa sum sum
france sum sum
can sum sum
italy sum sum
Totals sum sum
con y f
usa sum sum
france sum sum
can sum sum
italy sum sum
Totals sum sum
e1 e2 y f
1 usa std 1 1
2 usa std 1 2
3 can con 1 3
4 france con 1 4
5 can std 1 5
6 italy con 1 6
7 usa std 2 7
8 usa std 2 8
9 can con 2 9
10 france con 2 10
11 can std 2 11
12 italy con 2 12
13 usa std 3 13
14 usa std 3 14
15 can con 3 15
16 france con 3 16
17 can std 3 17
18 italy con 3 18
19 usa std 4 19
20 usa std 4 20
21 can con 4 21
22 france con 4 22
23 can std 4 23
24 italy con 4 24
25 usa std 5 25
26 usa std 5 26
27 can con 5 27
28 france con 5 28
29 can std 5 29
30 italy con 5 30
31 usa std 6 31
32 usa std 6 32
33 can con 6 33
34 france con 6 34
35 can std 6 35
36 italy con 6 36
--
View this message in context:
http://r.789695.n4.nabble.com/Not-sure-how-to-use-aggregate-colSums-by-tp3743258p3743258.html
Sent from the R help mailing list archive at Nabble.com.
I hope this will help you get going
b <- sapply(unique(test$e2), function(x) {
out <- aggregate(cbind(y,f)~e1, subset(test, e2==x),"sum")
out <- rbind(out, data.frame(e1="total", y=sum(out$y),
f=sum(out$f)))
out <- list(out)
names(out) <- x
out
})
> b
$std
e1 y f
1 can 21 120
2 usa 42 198
3 total 63 318
$con
e1 y f
1 can 21 108
2 france 21 114
3 italy 21 126
4 total 63 348
On 08/14/2011 12:20 PM, eric wrote:> I have a data frame called test shown below that i would like to summarize
in
> a particular way :
>
> I want to show the column sums (columns y ,f) grouped by country (column
> e1). However, I'm looking for the data to be split according to column
e2.
> In other words, two tables of sum by country. One table for "con"
and one
> table for "std" shown in column e2. Finally at the bottom of the
two tables,
> I would like the overall sum /Totals for all the countries for the two
> columns (y,f). The lay outs for the two tables I'm looking for are
also
> shown below in case my description isn't completely clear
>
> I would also like to be able to use the Totals of y and f for the two
tables
> in other calculations.
>
> I can get the two sets of totals with the following commands but not the
> sums by country.
>
> colSums(test[test$e2=="std", c(3,4)])
> colSums(test[test$e2=="con", c(3,4)])
>
> I know there's an easy way to do this with a combination of colSums,
by,
> aggregate but I can't seem to get it.
>
> std y f
>
> usa sum sum
> france sum sum
> can sum sum
> italy sum sum
> Totals sum sum
>
> con y f
>
> usa sum sum
> france sum sum
> can sum sum
> italy sum sum
> Totals sum sum
>
> e1 e2 y f
> 1 usa std 1 1
> 2 usa std 1 2
> 3 can con 1 3
> 4 france con 1 4
> 5 can std 1 5
> 6 italy con 1 6
> 7 usa std 2 7
> 8 usa std 2 8
> 9 can con 2 9
> 10 france con 2 10
> 11 can std 2 11
> 12 italy con 2 12
> 13 usa std 3 13
> 14 usa std 3 14
> 15 can con 3 15
> 16 france con 3 16
> 17 can std 3 17
> 18 italy con 3 18
> 19 usa std 4 19
> 20 usa std 4 20
> 21 can con 4 21
> 22 france con 4 22
> 23 can std 4 23
> 24 italy con 4 24
> 25 usa std 5 25
> 26 usa std 5 26
> 27 can con 5 27
> 28 france con 5 28
> 29 can std 5 29
> 30 italy con 5 30
> 31 usa std 6 31
> 32 usa std 6 32
> 33 can con 6 33
> 34 france con 6 34
> 35 can std 6 35
> 36 italy con 6 36
>
> --
> View this message in context:
http://r.789695.n4.nabble.com/Not-sure-how-to-use-aggregate-colSums-by-tp3743258p3743258.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Hi eric,
Try
lapply(with(x, split(x, e2)), function(l){
r <- with(l, aggregate(list(y, f), list(e1), sum))
colnames(r) <- c('e1', 'y', 'f')
r
})
HTH,
Jorge
On Sun, Aug 14, 2011 at 1:20 PM, eric <> wrote:
> I have a data frame called test shown below that i would like to summarize
> in
> a particular way :
>
> I want to show the column sums (columns y ,f) grouped by country (column
> e1). However, I'm looking for the data to be split according to column
e2.
> In other words, two tables of sum by country. One table for "con"
and one
> table for "std" shown in column e2. Finally at the bottom of the
two
> tables,
> I would like the overall sum /Totals for all the countries for the two
> columns (y,f). The lay outs for the two tables I'm looking for are
also
> shown below in case my description isn't completely clear
>
> I would also like to be able to use the Totals of y and f for the two
> tables
> in other calculations.
>
> I can get the two sets of totals with the following commands but not the
> sums by country.
>
> colSums(test[test$e2=="std", c(3,4)])
> colSums(test[test$e2=="con", c(3,4)])
>
> I know there's an easy way to do this with a combination of colSums,
by,
> aggregate but I can't seem to get it.
>
> std y f
>
> usa sum sum
> france sum sum
> can sum sum
> italy sum sum
> Totals sum sum
>
> con y f
>
> usa sum sum
> france sum sum
> can sum sum
> italy sum sum
> Totals sum sum
>
> e1 e2 y f
> 1 usa std 1 1
> 2 usa std 1 2
> 3 can con 1 3
> 4 france con 1 4
> 5 can std 1 5
> 6 italy con 1 6
> 7 usa std 2 7
> 8 usa std 2 8
> 9 can con 2 9
> 10 france con 2 10
> 11 can std 2 11
> 12 italy con 2 12
> 13 usa std 3 13
> 14 usa std 3 14
> 15 can con 3 15
> 16 france con 3 16
> 17 can std 3 17
> 18 italy con 3 18
> 19 usa std 4 19
> 20 usa std 4 20
> 21 can con 4 21
> 22 france con 4 22
> 23 can std 4 23
> 24 italy con 4 24
> 25 usa std 5 25
> 26 usa std 5 26
> 27 can con 5 27
> 28 france con 5 28
> 29 can std 5 29
> 30 italy con 5 30
> 31 usa std 6 31
> 32 usa std 6 32
> 33 can con 6 33
> 34 france con 6 34
> 35 can std 6 35
> 36 italy con 6 36
>
> --
> View this message in context:
>
http://r.789695.n4.nabble.com/Not-sure-how-to-use-aggregate-colSums-by-tp3743258p3743258.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
[[alternative HTML version deleted]]
Hi:
Here's another approach using the reshape2 package. I called your data
frame dat in the code below.
library('reshape2')
mdat <- melt(dat, measure = c('y', 'f'))
acast(mdat, e1 ~ variable ~ e2, fun = sum, margins = 'e1')
, , con
y f
can 21 108
france 21 114
italy 21 126
usa 0 0
(all) 63 348
, , std
y f
can 21 120
france 0 0
italy 0 0
usa 42 198
(all) 63 318
This returns an array rather than a list.
HTH,
Dennis
On Sun, Aug 14, 2011 at 10:20 AM, eric <ericstrom at aol.com>
wrote:> I have a data frame called test shown below that i would like to summarize
in
> a particular way :
>
> I want to show the column sums (columns y ,f) grouped by country (column
> e1). However, I'm looking for the data to be split according to column
e2.
> In other words, two tables of sum by country. One table for "con"
and one
> table for "std" shown in column e2. Finally at the bottom of the
two tables,
> I would like the overall sum /Totals for all the countries for the two
> columns (y,f). ?The lay outs for the two tables I'm looking for are
also
> shown below in case my description isn't completely clear
>
> I would also like to be able to use the Totals of y and f for the two
tables
> in other calculations.
>
> I can get the two sets of totals with the following commands but not the
> sums by country.
>
> colSums(test[test$e2=="std", c(3,4)])
> colSums(test[test$e2=="con", c(3,4)])
>
> I know there's an easy way to do this with a combination of colSums,
by,
> aggregate but I can't seem to get it.
>
> std ? ? ? ? y ? ? ? f
>
> usa ? ? ? ?sum ? sum
> france ? ?sum ? sum
> can ? ? ? ?sum ? sum
> italy ? ? ? sum ? sum
> Totals ? ?sum ? sum
>
> con ? ? ? y ? ? ? f
>
> usa ? ? ? sum ? sum
> france ? sum ? sum
> can ? ? ? sum ? sum
> italy ? ? ?sum ? sum
> Totals ? ?sum ? sum
>
> ? ? ?e1 ?e2 y ?f
> 1 ? ? usa std 1 ?1
> 2 ? ? usa std 1 ?2
> 3 ? ? can con 1 ?3
> 4 ?france con 1 ?4
> 5 ? ? can std 1 ?5
> 6 ? italy con 1 ?6
> 7 ? ? usa std 2 ?7
> 8 ? ? usa std 2 ?8
> 9 ? ? can con 2 ?9
> 10 france con 2 10
> 11 ? ?can std 2 11
> 12 ?italy con 2 12
> 13 ? ?usa std 3 13
> 14 ? ?usa std 3 14
> 15 ? ?can con 3 15
> 16 france con 3 16
> 17 ? ?can std 3 17
> 18 ?italy con 3 18
> 19 ? ?usa std 4 19
> 20 ? ?usa std 4 20
> 21 ? ?can con 4 21
> 22 france con 4 22
> 23 ? ?can std 4 23
> 24 ?italy con 4 24
> 25 ? ?usa std 5 25
> 26 ? ?usa std 5 26
> 27 ? ?can con 5 27
> 28 france con 5 28
> 29 ? ?can std 5 29
> 30 ?italy con 5 30
> 31 ? ?usa std 6 31
> 32 ? ?usa std 6 32
> 33 ? ?can con 6 33
> 34 france con 6 34
> 35 ? ?can std 6 35
> 36 ?italy con 6 36
>
> --
> View this message in context:
http://r.789695.n4.nabble.com/Not-sure-how-to-use-aggregate-colSums-by-tp3743258p3743258.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>