thr3ads.net - R help - [R] sum a particular column by group [Feb 2010]

Fang (Betty) Yang

2010-Feb-05 17:20 UTC

[R] sum a particular column by group

Dear all,

I have a table like this:

> eds
  R.ID Region Gender  Agegr  Time nvisits
1    1      A      F 60--64  1:00       1
2    2      O      F 55--59  1:20       1
3    3      O      F 55--59  3:45       3
4    4      S      M 60--64  1:10       3
5    5      W      F 55--59 12:30       1
6    6      W      M 60--64  8:00       2

I got a bootstrap sample using the following code:

> r<-sample(eds[,1],replace=TRUE)
> r
[1] 2 4 3 2 6 4
> beds<-eds[r,]
> beds
    R.ID Region Gender  Agegr Time nvisits
2      2      O      F 55--59 1:20       1
4      4      S      M 60--64 1:10       3
3      3      O      F 55--59 3:45       3
2.1    2      O      F 55--59 1:20       1
6      6      W      M 60--64 8:00       2
4.1    4      S      M 60--64 1:10       3
 
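For anyone wanting to reproduce the example, here is a minimal sketch that rebuilds the data from the printed output. The column types are assumptions (factors for Region, Gender and Agegr, character for Time), since the post only shows console output, and the index vector r is hard-coded to the values printed in the post instead of being drawn at random.

```r
# Rebuild the example data; column types are assumed, not taken from the post.
eds <- data.frame(
  R.ID    = 1:6,
  Region  = factor(c("A", "O", "O", "S", "W", "W")),
  Gender  = factor(c("F", "F", "F", "M", "F", "M")),
  Agegr   = factor(c("60--64", "55--59", "55--59", "60--64", "55--59", "60--64")),
  Time    = c("1:00", "1:20", "3:45", "1:10", "12:30", "8:00"),
  nvisits = c(1, 1, 3, 3, 1, 2),
  stringsAsFactors = FALSE
)

# Indices as printed in the post, fixed so that later results can be compared.
# For a fresh draw, sample(nrow(eds), replace = TRUE) samples row numbers
# directly; sample(eds[, 1], ...) only works here because R.ID equals the
# row number.
r <- c(2, 4, 3, 2, 6, 4)
beds <- eds[r, ]

# Subsetting keeps all factor levels, so Region still has level "A"
# even though no sampled row belongs to it.
levels(beds$Region)
```

Keeping the unused level is what lets table() report the empty A groups in the attempts that follow.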

I want to sum the last column by columns 2, 3, and 4 (including the groups
whose sum is 0).  I tried the following code:

# 1: this only gives the frequencies, not the sum of the last column.
> table<-as.data.frame(with(beds,table(beds[,2],beds[,3],beds[,4])))
> table
   Var1 Var2   Var3 Freq
1     A    F 55--59    0
2     O    F 55--59    3
3     S    F 55--59    0
4     W    F 55--59    0
5     A    M 55--59    0
6     O    M 55--59    0
7     S    M 55--59    0
8     W    M 55--59    0
9     A    F 60--64    0
10    O    F 60--64    0
11    S    F 60--64    0
12    W    F 60--64    0
13    A    M 60--64    0
14    O    M 60--64    0
15    S    M 60--64    2
16    W    M 60--64    1
 

# 2: this gives the sum of the last column, but misses the groups with 0 counts.
> aggregate(beds[,6],list(beds[,2],beds[,3],beds[,4]),sum)
  Group.1 Group.2 Group.3 x
1       O       F  55--59 5
2       S       M  60--64 6
3       W       M  60--64 2
 

In conclusion, the following is what I want:

   Var1 Var2   Var3 Freq
1     A    F 55--59    0
2     O    F 55--59    5
3     S    F 55--59    0
4     W    F 55--59    0
5     A    M 55--59    0
6     O    M 55--59    0
7     S    M 55--59    0
8     W    M 55--59    0
9     A    F 60--64    0
10    O    F 60--64    0
11    S    F 60--64    0
12    W    F 60--64    0
13    A    M 60--64    0
14    O    M 60--64    0
15    S    M 60--64    6
16    W    M 60--64    2
 

Does anyone know code to do this, or can anyone give a hint? Thank you in advance.

 

Betty

Dennis Murphy

2010-Feb-05 19:19 UTC


Hi:

This is not an elegant solution by any means, but it gets what you
want. Using the data frame from your bootstrap sample:

# All combinations of the three factors
xx <- with(beds, expand.grid(Region = levels(Region), Gender = levels(Gender),
               Agegr = levels(Agegr)))
> dim(xx)
[1] 12  3
# differs from the 16, but bootstrapping probably explains it...

# One way to get a summary (there are others...)
library(plyr)
yy <- ddply(beds, .(Region, Gender, Agegr), summarise, Nvisits = sum(nvisits))
# Keep every combination from xx; groups absent from yy get NA
res <- merge(xx, yy, all.x = TRUE)
# Replace the NA sums of the empty groups with 0
res <- within(res, Nvisits[is.na(Nvisits)] <- 0)
> res
   Region Gender  Agegr Nvisits
1       O      F 55--59       5
2       O      F 60--64       0
3       O      M 55--59       0
4       O      M 60--64       0
5       S      F 55--59       0
6       S      F 60--64       0
7       S      M 55--59       0
8       S      M 60--64       6
9       W      F 55--59       0
10      W      F 60--64       0
11      W      M 55--59       0
12      W      M 60--64       2


HTH,
Dennis
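
A base R alternative, offered as a sketch and not as part of Dennis's answer: xtabs() with a variable on the left-hand side of the formula sums that variable within every combination of factor levels, so empty groups come out as 0 without a separate merge step. The data frame is rebuilt here with assumed column types (factors for the three grouping columns) to keep the example self-contained; the Time column is left out because it is not needed for the sum.

```r
# Rebuild the bootstrap sample from the post; column types are assumptions.
eds <- data.frame(
  R.ID    = 1:6,
  Region  = factor(c("A", "O", "O", "S", "W", "W")),
  Gender  = factor(c("F", "F", "F", "M", "F", "M")),
  Agegr   = factor(c("60--64", "55--59", "55--59", "60--64", "55--59", "60--64")),
  nvisits = c(1, 1, 3, 3, 1, 2)
)
beds <- eds[c(2, 4, 3, 2, 6, 4), ]

# Sum nvisits over all Region x Gender x Agegr cells. Cells with no rows
# are filled with 0 because the factors keep their unused levels.
tab <- xtabs(nvisits ~ Region + Gender + Agegr, data = beds)

# Flatten the 4 x 2 x 2 table into one row per cell; the sums land in "Freq".
res <- as.data.frame(tab)
res
```

Because Region keeps its unused level A in this reconstruction, the result has 16 rows, one per combination of levels, which is the layout asked for in the original question.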

Fang (Betty) Yang

2010-Feb-05 20:16 UTC


Thanks for your help. Finally, I got it.

 

