Dear all,

I have a table like this:

> eds
  R.ID Region Gender  Agegr  Time nvisits
1    1      A      F 60--64  1:00       1
2    2      O      F 55--59  1:20       1
3    3      O      F 55--59  3:45       3
4    4      S      M 60--64  1:10       3
5    5      W      F 55--59 12:30       1
6    6      W      M 60--64  8:00       2

I got a bootstrap sample using the following code:

> r<-sample(eds[,1],replace=TRUE)
> r
[1] 2 4 3 2 6 4
> beds<-eds[r,]
> beds
    R.ID Region Gender  Agegr Time nvisits
2      2      O      F 55--59 1:20       1
4      4      S      M 60--64 1:10       3
3      3      O      F 55--59 3:45       3
2.1    2      O      F 55--59 1:20       1
6      6      W      M 60--64 8:00       2
4.1    4      S      M 60--64 1:10       3

I want to sum the last column by columns 2, 3, and 4 (including a 0 for
groups that have no rows). I tried the following code:

#1: only gets the frequency, not the sum of the last column.

> table<-as.data.frame(with(beds,table(beds[,2],beds[,3],beds[,4])))
> table
   Var1 Var2   Var3 Freq
1     A    F 55--59    0
2     O    F 55--59    3
3     S    F 55--59    0
4     W    F 55--59    0
5     A    M 55--59    0
6     O    M 55--59    0
7     S    M 55--59    0
8     W    M 55--59    0
9     A    F 60--64    0
10    O    F 60--64    0
11    S    F 60--64    0
12    W    F 60--64    0
13    A    M 60--64    0
14    O    M 60--64    0
15    S    M 60--64    2
16    W    M 60--64    1

#2: gets the sum of the last column, but misses the groups with 0 counts.

> aggregate(beds[,6],list(beds[,2],beds[,3],beds[,4]),sum)
  Group.1 Group.2 Group.3 x
1       O       F  55--59 5
2       S       M  60--64 6
3       W       M  60--64 2

In conclusion, the following is what I want:

   Var1 Var2   Var3 Freq
1     A    F 55--59    0
2     O    F 55--59    5
3     S    F 55--59    0
4     W    F 55--59    0
5     A    M 55--59    0
6     O    M 55--59    0
7     S    M 55--59    0
8     W    M 55--59    0
9     A    F 60--64    0
10    O    F 60--64    0
11    S    F 60--64    0
12    W    F 60--64    0
13    A    M 60--64    0
14    O    M 60--64    0
15    S    M 60--64    6
16    W    M 60--64    2

Does anyone know code to do this, or can anyone give a hint? Thank you in
advance.

Betty
Hi:

This is not an elegant solution by any means, but it gets what you want...
using the data frame from your bootstrap sample,

# All combinations of the three factors
xx <- with(beds, expand.grid(Region = levels(Region),
                             Gender = levels(Gender),
                             Agegr  = levels(Agegr)))
> dim(xx)
[1] 12  3
# differs from the 16 in your table: region A never appears in the
# bootstrap sample I used, so it is not a level of Region here

# One way to get a summary (there are others...)
library(plyr)
yy <- ddply(beds, .(Region, Gender, Agegr), summarise,
            Nvisits = sum(nvisits))

# Keep every row of the full grid; combinations with no data get NA,
# which is then replaced by 0
res <- merge(xx, yy, all.x = TRUE)
res <- within(res, Nvisits[is.na(Nvisits)] <- 0)
> res
   Region Gender  Agegr Nvisits
1       O      F 55--59       5
2       O      F 60--64       0
3       O      M 55--59       0
4       O      M 60--64       0
5       S      F 55--59       0
6       S      F 60--64       0
7       S      M 55--59       0
8       S      M 60--64       6
9       W      F 55--59       0
10      W      F 60--64       0
11      W      M 55--59       0
12      W      M 60--64       2

HTH,
Dennis

On Fri, Feb 5, 2010 at 9:20 AM, Fang (Betty) Yang <fang.yang@ualberta.ca> wrote:
> I want to sum the last column by columns 2,3,and 4(including 0 in some
> group).
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Thanks for your help. Finally, I got it.

From: Dennis Murphy [mailto:djmuser@gmail.com]
Sent: Friday, February 05, 2010 12:20 PM
To: Fang (Betty) Yang
Cc: r-help@r-project.org
Subject: Re: [R] sum a particular column by group
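
For the archive, here is a possible base-R alternative to the plyr approach in this thread, offered as a sketch under a few assumptions rather than a definitive answer. The data frame is re-entered by hand from the original post (the Time column is left out because it plays no part in the sums), and the bootstrap indices are fixed to the ones printed in the post instead of calling sample() again. The column name Nvisits is simply carried over from the reply. The idea is that xtabs() with a variable on the left-hand side of the formula sums that variable within each cell, and cells with no rows come out as 0.

```r
# Example data re-entered from the original post (assumption: Time omitted).
eds <- data.frame(
  R.ID    = 1:6,
  Region  = factor(c("A", "O", "O", "S", "W", "W")),
  Gender  = factor(c("F", "F", "F", "M", "F", "M")),
  Agegr   = factor(c("60--64", "55--59", "55--59",
                     "60--64", "55--59", "60--64")),
  nvisits = c(1, 1, 3, 3, 1, 2)
)

# Fixed bootstrap indices copied from the post (assumption: used here in
# place of sample() so the result can be compared with the desired table).
r <- c(2, 4, 3, 2, 6, 4)
beds <- eds[r, ]

# Row subsetting keeps unused factor levels, so region "A" is still a
# level of beds$Region even though no sampled row belongs to it.
levels(beds$Region)

# Sum nvisits within every Region x Gender x Agegr cell; empty cells are 0.
res <- as.data.frame(
  xtabs(nvisits ~ Region + Gender + Agegr, data = beds),
  responseName = "Nvisits"
)
res
```

Because beds is created by subsetting eds, all four regions remain factor levels and the result has the full 16 rows from the original question. If a data frame has lost its unused levels (for example after being re-read from text), the levels would have to be restored with factor(..., levels = ...) before the empty groups can appear.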