Dear all,

I have a table like this:

> eds
  R.ID Region Gender  Agegr  Time nvisits
1    1      A      F 60--64  1:00       1
2    2      O      F 55--59  1:20       1
3    3      O      F 55--59  3:45       3
4    4      S      M 60--64  1:10       3
5    5      W      F 55--59 12:30       1
6    6      W      M 60--64  8:00       2

I got a bootstrap sample using the following code:

> r<-sample(eds[,1],replace=TRUE)
> r
[1] 2 4 3 2 6 4
> beds<-eds[r,]
> beds
    R.ID Region Gender  Agegr Time nvisits
2      2      O      F 55--59 1:20       1
4      4      S      M 60--64 1:10       3
3      3      O      F 55--59 3:45       3
2.1    2      O      F 55--59 1:20       1
6      6      W      M 60--64 8:00       2
4.1    4      S      M 60--64 1:10       3

I want to sum the last column by columns 2, 3, and 4 (including 0 in some
groups). I tried the following code:

# 1: only gets the freq, not the sum of the last column.
> table<-as.data.frame(with(beds,table(beds[,2],beds[,3],beds[,4])))
> table
   Var1 Var2   Var3 Freq
1     A    F 55--59    0
2     O    F 55--59    3
3     S    F 55--59    0
4     W    F 55--59    0
5     A    M 55--59    0
6     O    M 55--59    0
7     S    M 55--59    0
8     W    M 55--59    0
9     A    F 60--64    0
10    O    F 60--64    0
11    S    F 60--64    0
12    W    F 60--64    0
13    A    M 60--64    0
14    O    M 60--64    0
15    S    M 60--64    2
16    W    M 60--64    1

# 2: only gets the sum of the last column, but misses the groups with 0 counts.
> aggregate(beds[,6],list(beds[,2],beds[,3],beds[,4]),sum)
  Group.1 Group.2 Group.3 x
1       O       F  55--59 5
2       S       M  60--64 6
3       W       M  60--64 2

In conclusion, the following is what I want:

   Var1 Var2   Var3 Freq
1     A    F 55--59    0
2     O    F 55--59    5
3     S    F 55--59    0
4     W    F 55--59    0
5     A    M 55--59    0
6     O    M 55--59    0
7     S    M 55--59    0
8     W    M 55--59    0
9     A    F 60--64    0
10    O    F 60--64    0
11    S    F 60--64    0
12    W    F 60--64    0
13    A    M 60--64    0
14    O    M 60--64    0
15    S    M 60--64    6
16    W    M 60--64    2

Does anyone know code to do this, or can anyone give a hint? Thank you in advance.

Betty
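A side note on the sampling step in the question: sample(eds[,1], replace=TRUE) draws values of R.ID and then uses them as row indices, which works here only because R.ID happens to equal the row number. A minimal sketch that samples row positions instead is below. The data frame is rebuilt from the values shown in the post, and the seed is an arbitrary assumption added only so the draw is repeatable.

```r
# Toy copy of the poster's `eds`, typed in from the printed table.
eds <- data.frame(
  R.ID    = 1:6,
  Region  = factor(c("A", "O", "O", "S", "W", "W")),
  Gender  = factor(c("F", "F", "F", "M", "F", "M")),
  Agegr   = factor(c("60--64", "55--59", "55--59", "60--64", "55--59", "60--64")),
  Time    = c("1:00", "1:20", "3:45", "1:10", "12:30", "8:00"),
  nvisits = c(1, 1, 3, 3, 1, 2)
)

set.seed(1)  # assumption: arbitrary seed, not part of the original post

# Sample row positions 1..nrow(eds) with replacement, independent of R.ID.
idx  <- sample(nrow(eds), replace = TRUE)
beds <- eds[idx, ]

# Subsetting keeps every factor level, including ones not drawn.
levels(beds$Region)
```

If R.ID were ever not 1, 2, ..., n (for example after filtering rows), indexing by sampled ID values would pick the wrong rows or return NA rows, whereas sampling positions would still be valid.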
Hi:
This is not an elegant solution by any means, but it gets what you want,
using the data frame from your bootstrap sample.
# All combinations of the three factors
xx <- with(beds, expand.grid(Region = levels(Region), Gender = levels(Gender),
                             Agegr = levels(Agegr)))
> dim(xx)
[1] 12  3   # differs from the 16, but bootstrapping probably explains it...
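On the 12 versus 16: one plausible explanation (an assumption, since the thread does not show how beds was recreated) is factor levels. Subsetting a factor keeps all of its original levels, so beds taken directly from eds would still list Region A and give 4 x 2 x 2 = 16 combinations, while a beds rebuilt from the printed rows would have only O, S and W and give 3 x 2 x 2 = 12. A small sketch using the indices from the post:

```r
# Region column of the original data and the bootstrap indices from the post.
Region <- factor(c("A", "O", "O", "S", "W", "W"))
r <- c(2, 4, 3, 2, 6, 4)

sub <- Region[r]

levels(sub)              # "A" "O" "S" "W": the unused level A is retained
levels(droplevels(sub))  # "O" "S" "W": unused levels dropped explicitly

# Number of rows expand.grid() would produce in each case
# (two genders and two age groups in both).
n_keep <- length(levels(sub)) * 2 * 2
n_drop <- length(levels(droplevels(sub))) * 2 * 2
```

Here n_keep is 16 and n_drop is 12, matching the two counts discussed above.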
# One way to get a summary (there are others...)
library(plyr)
yy <- ddply(beds, .(Region, Gender, Agegr), summarise, Nvisits = sum(nvisits))
# Left join onto the full grid, then turn the missing sums into zeros
res <- merge(xx, yy, all.x = TRUE)
res <- within(res, Nvisits[is.na(Nvisits)] <- 0)
> res
   Region Gender  Agegr Nvisits
1       O      F 55--59       5
2       O      F 60--64       0
3       O      M 55--59       0
4       O      M 60--64       0
5       S      F 55--59       0
6       S      F 60--64       0
7       S      M 55--59       0
8       S      M 60--64       6
9       W      F 55--59       0
10      W      F 60--64       0
11      W      M 55--59       0
12      W      M 60--64       2
HTH,
Dennis
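For comparison, here is a base-R sketch that is not from the thread: xtabs() with the summed column on the left-hand side of the formula cross-tabulates sums rather than counts, and empty cells come out as 0. The beds data below is typed in from the post, with Region given all four levels (A, O, S, W) on the assumption that the bootstrap sample kept the levels of eds; the column name Nvisits is borrowed from the reply above.

```r
# Bootstrap sample from the post; Region keeps the unused level "A".
beds <- data.frame(
  Region  = factor(c("O", "S", "O", "O", "W", "S"),
                   levels = c("A", "O", "S", "W")),
  Gender  = factor(c("F", "M", "F", "F", "M", "M")),
  Agegr   = factor(c("55--59", "60--64", "55--59", "55--59", "60--64", "60--64")),
  nvisits = c(1, 3, 3, 1, 2, 3)
)

# Sum nvisits within every Region x Gender x Agegr cell; empty cells are 0.
tab <- xtabs(nvisits ~ Region + Gender + Agegr, data = beds)

# Long format: one row per combination, sums in a column named Nvisits.
res <- as.data.frame(tab, responseName = "Nvisits")
res
```

This should give 16 rows in the same order as the table requested in the question, with 5, 6 and 2 in the three non-empty groups. It avoids the expand.grid/merge step, at the cost of relying on the factor levels being set correctly beforehand.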
On Fri, Feb 5, 2010 at 9:20 AM, Fang (Betty) Yang <fang.yang@ualberta.ca> wrote:
> [...]
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
Thanks for your help. Finally, I got it.
From: Dennis Murphy [mailto:djmuser@gmail.com]
Sent: Friday, February 05, 2010 12:20 PM
To: Fang (Betty) Yang
Cc: r-help@r-project.org
Subject: Re: [R] sum a particular column by group