thr3ads.net - R help - [R] Not sure how to use aggregate, colSums, by [Aug 2011]

If this information is useful, please help other people find it:
Share via:

eric

2011-Aug-14 17:20 UTC

[R] Not sure how to use aggregate, colSums, by

I have a data frame called test shown below that i would like to summarize in
a particular way :

I want to show the column sums (columns y ,f) grouped by country (column
e1). However, I'm looking for the data to be split according to column e2.
In other words, two tables of sum by country. One table for "con" and
one
table for "std" shown in column e2. Finally at the bottom of the two
tables,
I would like the overall sum /Totals for all the countries for the two
columns (y,f).  The lay outs for the two tables I'm looking for are also
shown below in case my description isn't completely clear

I would also like to be able to use the Totals of y and f for the two tables
in other calculations. 

I can get the two sets of totals with the following commands but not the
sums by country.

colSums(test[test$e2=="std", c(3,4)])
colSums(test[test$e2=="con", c(3,4)])

I know there's an easy way to do this with a combination of colSums, by,
aggregate but I can't seem to get it.

std         y       f

usa        sum   sum
france    sum   sum
can        sum   sum
italy       sum   sum
Totals    sum   sum

con       y       f

usa       sum   sum
france   sum   sum
can       sum   sum
italy      sum   sum
Totals    sum   sum

      e1  e2 y  f
1     usa std 1  1
2     usa std 1  2
3     can con 1  3
4  france con 1  4
5     can std 1  5
6   italy con 1  6
7     usa std 2  7
8     usa std 2  8
9     can con 2  9
10 france con 2 10
11    can std 2 11
12  italy con 2 12
13    usa std 3 13
14    usa std 3 14
15    can con 3 15
16 france con 3 16
17    can std 3 17
18  italy con 3 18
19    usa std 4 19
20    usa std 4 20
21    can con 4 21
22 france con 4 22
23    can std 4 23
24  italy con 4 24
25    usa std 5 25
26    usa std 5 26
27    can con 5 27
28 france con 5 28
29    can std 5 29
30  italy con 5 30
31    usa std 6 31
32    usa std 6 32
33    can con 6 33
34 france con 6 34
35    can std 6 35
36  italy con 6 36

--
View this message in context:
http://r.789695.n4.nabble.com/Not-sure-how-to-use-aggregate-colSums-by-tp3743258p3743258.html
Sent from the R help mailing list archive at Nabble.com.

Mikhail Titov

2011-Aug-14 21:58 UTC

head link

[R] Not sure how to use aggregate, colSums, by

I hope this will help you get going

b <- sapply(unique(test$e2), function(x) {
    out <- aggregate(cbind(y,f)~e1, subset(test, e2==x),"sum")
    out <- rbind(out, data.frame(e1="total", y=sum(out$y),
f=sum(out$f)))
    out <- list(out)
    names(out) <- x
    out
})

> b$std
     e1  y   f
1   can 21 120
2   usa 42 198
3 total 63 318

$con
      e1  y   f
1    can 21 108
2 france 21 114
3  italy 21 126
4  total 63 348


On 08/14/2011 12:20 PM, eric wrote:> I have a data frame called test shown below that i would like to summarize
in
> a particular way :
>
> I want to show the column sums (columns y ,f) grouped by country (column
> e1). However, I'm looking for the data to be split according to column
e2.
> In other words, two tables of sum by country. One table for "con"
and one
> table for "std" shown in column e2. Finally at the bottom of the
two tables,
> I would like the overall sum /Totals for all the countries for the two
> columns (y,f).  The lay outs for the two tables I'm looking for are
also
> shown below in case my description isn't completely clear
>
> I would also like to be able to use the Totals of y and f for the two
tables
> in other calculations. 
>
> I can get the two sets of totals with the following commands but not the
> sums by country.
>
> colSums(test[test$e2=="std", c(3,4)])
> colSums(test[test$e2=="con", c(3,4)])
>
> I know there's an easy way to do this with a combination of colSums,
by,
> aggregate but I can't seem to get it.
>
> std         y       f
>
> usa        sum   sum
> france    sum   sum
> can        sum   sum
> italy       sum   sum
> Totals    sum   sum
>
> con       y       f
>
> usa       sum   sum
> france   sum   sum
> can       sum   sum
> italy      sum   sum
> Totals    sum   sum
>
>       e1  e2 y  f
> 1     usa std 1  1
> 2     usa std 1  2
> 3     can con 1  3
> 4  france con 1  4
> 5     can std 1  5
> 6   italy con 1  6
> 7     usa std 2  7
> 8     usa std 2  8
> 9     can con 2  9
> 10 france con 2 10
> 11    can std 2 11
> 12  italy con 2 12
> 13    usa std 3 13
> 14    usa std 3 14
> 15    can con 3 15
> 16 france con 3 16
> 17    can std 3 17
> 18  italy con 3 18
> 19    usa std 4 19
> 20    usa std 4 20
> 21    can con 4 21
> 22 france con 4 22
> 23    can std 4 23
> 24  italy con 4 24
> 25    usa std 5 25
> 26    usa std 5 26
> 27    can con 5 27
> 28 france con 5 28
> 29    can std 5 29
> 30  italy con 5 30
> 31    usa std 6 31
> 32    usa std 6 32
> 33    can con 6 33
> 34 france con 6 34
> 35    can std 6 35
> 36  italy con 6 36
>
> --
> View this message in context:
http://r.789695.n4.nabble.com/Not-sure-how-to-use-aggregate-colSums-by-tp3743258p3743258.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Jorge Ivan Velez

2011-Aug-14 22:05 UTC

head link

[R] Not sure how to use aggregate, colSums, by

Hi eric,

Try

lapply(with(x, split(x, e2)), function(l){
r <- with(l, aggregate(list(y, f), list(e1), sum))
colnames(r) <- c('e1', 'y', 'f')
 r
   })

HTH,
Jorge


On Sun, Aug 14, 2011 at 1:20 PM, eric <> wrote:
> I have a data frame called test shown below that i would like to summarize
> in
> a particular way :
>
> I want to show the column sums (columns y ,f) grouped by country (column
> e1). However, I'm looking for the data to be split according to column
e2.
> In other words, two tables of sum by country. One table for "con"
and one
> table for "std" shown in column e2. Finally at the bottom of the
two
> tables,
> I would like the overall sum /Totals for all the countries for the two
> columns (y,f).  The lay outs for the two tables I'm looking for are
also
> shown below in case my description isn't completely clear
>
> I would also like to be able to use the Totals of y and f for the two
> tables
> in other calculations.
>
> I can get the two sets of totals with the following commands but not the
> sums by country.
>
> colSums(test[test$e2=="std", c(3,4)])
> colSums(test[test$e2=="con", c(3,4)])
>
> I know there's an easy way to do this with a combination of colSums,
by,
> aggregate but I can't seem to get it.
>
> std         y       f
>
> usa        sum   sum
> france    sum   sum
> can        sum   sum
> italy       sum   sum
> Totals    sum   sum
>
> con       y       f
>
> usa       sum   sum
> france   sum   sum
> can       sum   sum
> italy      sum   sum
> Totals    sum   sum
>
>      e1  e2 y  f
> 1     usa std 1  1
> 2     usa std 1  2
> 3     can con 1  3
> 4  france con 1  4
> 5     can std 1  5
> 6   italy con 1  6
> 7     usa std 2  7
> 8     usa std 2  8
> 9     can con 2  9
> 10 france con 2 10
> 11    can std 2 11
> 12  italy con 2 12
> 13    usa std 3 13
> 14    usa std 3 14
> 15    can con 3 15
> 16 france con 3 16
> 17    can std 3 17
> 18  italy con 3 18
> 19    usa std 4 19
> 20    usa std 4 20
> 21    can con 4 21
> 22 france con 4 22
> 23    can std 4 23
> 24  italy con 4 24
> 25    usa std 5 25
> 26    usa std 5 26
> 27    can con 5 27
> 28 france con 5 28
> 29    can std 5 29
> 30  italy con 5 30
> 31    usa std 6 31
> 32    usa std 6 32
> 33    can con 6 33
> 34 france con 6 34
> 35    can std 6 35
> 36  italy con 6 36
>
> --
> View this message in context:
>
http://r.789695.n4.nabble.com/Not-sure-how-to-use-aggregate-colSums-by-tp3743258p3743258.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
	[[alternative HTML version deleted]]

Dennis Murphy

2011-Aug-14 23:38 UTC

head link

[R] Not sure how to use aggregate, colSums, by

Hi:

Here's another approach using the reshape2 package. I called your data
frame dat in the code below.

library('reshape2')
mdat <- melt(dat, measure = c('y', 'f'))
acast(mdat, e1 ~ variable ~ e2, fun = sum, margins = 'e1')
, , con

        y   f
can    21 108
france 21 114
italy  21 126
usa     0   0
(all)  63 348

, , std

        y   f
can    21 120
france  0   0
italy   0   0
usa    42 198
(all)  63 318

This returns an array rather than a list.

HTH,
Dennis

On Sun, Aug 14, 2011 at 10:20 AM, eric <ericstrom at aol.com>
wrote:> I have a data frame called test shown below that i would like to summarize
in
> a particular way :
>
> I want to show the column sums (columns y ,f) grouped by country (column
> e1). However, I'm looking for the data to be split according to column
e2.
> In other words, two tables of sum by country. One table for "con"
and one
> table for "std" shown in column e2. Finally at the bottom of the
two tables,
> I would like the overall sum /Totals for all the countries for the two
> columns (y,f). ?The lay outs for the two tables I'm looking for are
also
> shown below in case my description isn't completely clear
>
> I would also like to be able to use the Totals of y and f for the two
tables
> in other calculations.
>
> I can get the two sets of totals with the following commands but not the
> sums by country.
>
> colSums(test[test$e2=="std", c(3,4)])
> colSums(test[test$e2=="con", c(3,4)])
>
> I know there's an easy way to do this with a combination of colSums,
by,
> aggregate but I can't seem to get it.
>
> std ? ? ? ? y ? ? ? f
>
> usa ? ? ? ?sum ? sum
> france ? ?sum ? sum
> can ? ? ? ?sum ? sum
> italy ? ? ? sum ? sum
> Totals ? ?sum ? sum
>
> con ? ? ? y ? ? ? f
>
> usa ? ? ? sum ? sum
> france ? sum ? sum
> can ? ? ? sum ? sum
> italy ? ? ?sum ? sum
> Totals ? ?sum ? sum
>
> ? ? ?e1 ?e2 y ?f
> 1 ? ? usa std 1 ?1
> 2 ? ? usa std 1 ?2
> 3 ? ? can con 1 ?3
> 4 ?france con 1 ?4
> 5 ? ? can std 1 ?5
> 6 ? italy con 1 ?6
> 7 ? ? usa std 2 ?7
> 8 ? ? usa std 2 ?8
> 9 ? ? can con 2 ?9
> 10 france con 2 10
> 11 ? ?can std 2 11
> 12 ?italy con 2 12
> 13 ? ?usa std 3 13
> 14 ? ?usa std 3 14
> 15 ? ?can con 3 15
> 16 france con 3 16
> 17 ? ?can std 3 17
> 18 ?italy con 3 18
> 19 ? ?usa std 4 19
> 20 ? ?usa std 4 20
> 21 ? ?can con 4 21
> 22 france con 4 22
> 23 ? ?can std 4 23
> 24 ?italy con 4 24
> 25 ? ?usa std 5 25
> 26 ? ?usa std 5 26
> 27 ? ?can con 5 27
> 28 france con 5 28
> 29 ? ?can std 5 29
> 30 ?italy con 5 30
> 31 ? ?usa std 6 31
> 32 ? ?usa std 6 32
> 33 ? ?can con 6 33
> 34 france con 6 34
> 35 ? ?can std 6 35
> 36 ?italy con 6 36
>
> --
> View this message in context:
http://r.789695.n4.nabble.com/Not-sure-how-to-use-aggregate-colSums-by-tp3743258p3743258.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

Reasonably Related Threads

Search for more apparently analagous threads

R help - Aug 2011 - Not sure how to use aggregate, colSums, by

[R] Not sure how to use aggregate, colSums, by

[R] Not sure how to use aggregate, colSums, by

[R] Not sure how to use aggregate, colSums, by

[R] Not sure how to use aggregate, colSums, by

Reasonably Related Threads