Ray DiGiacomo, Jr.
2013-Jan-04 05:00 UTC
[R] "By" function Frame Conversion (with Multiple Indices)
Hello,

I have the following dataset. Please note that there are missing values on records 4 and 5:

id,age,weight,height,gender
1,22,180,72,m
2,13,100,67,f
3,5,40,40,f
4,6,42,,f
5,12,98,66,
6,50,255,60,m

I'm using the "By" function like this:

list1 <- by(dataset[c("weight", "height")],
            dataset[c("age", "gender")],
            colMeans,
            na.rm = TRUE)

I then convert the list above to a frame like this:

as.data.frame( do.call(rbind, list1) )

I get this output from the code above:

  weight height
1     40     40
2     42    NaN
3    100     67
4    180     72
5    255     60

I want to get the output above, but I also want two additional columns named "age" and "gender" (with the age and gender values from the "By" function output).

How would I do this?

Best Regards,

Ray DiGiacomo, Jr.
Healthcare Predictive Analytics Specialist
President, Lion Data Systems LLC
President, The Orange County R User Group
Board Member, TDWI
rayd@liondatasystems.com
(m) 408-425-7851
San Juan Capistrano, California USA
twitter.com/liondatasystems
linkedin.com/in/raydigiacomojr
youtube.com/user/liondatasystems/videos
liondatasystems.com/courses
David Winsemius
2013-Jan-04 07:46 UTC
[R] "By" function Frame Conversion (with Multiple Indices)
On Jan 3, 2013, at 9:00 PM, Ray DiGiacomo, Jr. wrote:

> Hello,
>
> I have the following dataset. Please note that there are missing
> values on records 4 and 5:
>
> id,age,weight,height,gender
> 1,22,180,72,m
> 2,13,100,67,f
> 3,5,40,40,f
> 4,6,42,,f
> 5,12,98,66,
> 6,50,255,60,m
>
> I'm using the "By" function like this:
>
> list1 <- by(dataset[c("weight", "height")],
>             dataset[c("age", "gender")],
>             colMeans,
>             na.rm = TRUE)

I named the dataframe "dat"

aggregate( dat[,c("weight", "height")],
           list( age=dat$age, gender=dat$gender),
           FUN=mean, na.rm=TRUE)

  age gender weight height
1  12            98     66
2   5      f     40     40
3   6      f     42    NaN
4  13      f    100     67
5  22      m    180     72
6  50      m    255     60

-- 
David.

> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Alameda, CA, USA
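A note on the aggregate() result above: the age-12 row shows up with a blank gender, which suggests the empty gender field was read as an empty string rather than as NA. The following is a minimal sketch, assuming the data are instead read with na.strings = "" (an assumption, not something stated in the reply). In that case aggregate() omits rows whose grouping values are NA, so only the five groups from the original post remain. The name `dat` follows the reply; `res` is an illustrative name.

```r
# Read the posted data, treating empty fields as NA (assumption: the
# reply above did not do this, hence its blank-gender row).
dat <- read.csv(text = "id,age,weight,height,gender
1,22,180,72,m
2,13,100,67,f
3,5,40,40,f
4,6,42,,f
5,12,98,66,
6,50,255,60,m", na.strings = "", stringsAsFactors = FALSE)

# Group means of weight and height by age and gender.
# aggregate() drops rows whose grouping values are NA, so the record
# with missing gender (age 12) does not form a group.
res <- aggregate(dat[c("weight", "height")],
                 by = list(age = dat$age, gender = dat$gender),
                 FUN = mean, na.rm = TRUE)
res
```

The height for the age-6 group is still NaN, because its only height value is missing and na.rm = TRUE leaves nothing to average.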
Hi,

You could try this:

dat1<-read.table(text="
id,age,weight,height,gender
1,22,180,72,m
2,13,100,67,f
3,5,40,40,f
4,6,42,,f
5,12,98,66,
6,50,255,60,m
",sep=",",header=TRUE,stringsAsFactors=FALSE,na.strings="")

list1<-by(dat1[c("weight","height")],dat1[c("age","gender")],colMeans,na.rm=TRUE,simplify=FALSE)
list2<-split(dat1,list(dat1$age,dat1$gender))
names(list1)<-names(list2)
res<-do.call(rbind,list1)
res2<-cbind(read.table(text=row.names(res),sep=".",header=FALSE,stringsAsFactors=FALSE),res)
colnames(res2)[1:2]<-c("age","gender")
row.names(res2)<-1:nrow(res2)
res2
#  age gender weight height
#1   5      f     40     40
#2   6      f     42    NaN
#3  13      f    100     67
#4  22      m    180     72
#5  50      m    255     60

library(plyr)
ddply(dat1,.(age,gender),colwise(mean,c("weight","height")),na.rm=TRUE)
#  age gender weight height
#1   5      f     40     40
#2   6      f     42    NaN
#3  12   <NA>     98     66 #prints groups which are missing
#4  13      f    100     67
#5  22      m    180     72
#6  50      m    255     60

A.K.
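As an alternative to parsing the row names, the age and gender labels can also be taken from the dimnames of the "by" object itself. This is only a sketch under stated assumptions: the data are read with na.strings = "" as in the reply above, and the names `b`, `keys`, `cells`, `keep`, `vals` and `out` are illustrative, not from the thread. It relies on by() returning a list array with one cell per age/gender combination, where combinations with no data are NULL.

```r
# Posted data; empty fields become NA.
dat1 <- read.csv(text = "id,age,weight,height,gender
1,22,180,72,m
2,13,100,67,f
3,5,40,40,f
4,6,42,,f
5,12,98,66,
6,50,255,60,m", na.strings = "", stringsAsFactors = FALSE)

# One cell per age x gender combination; empty combinations are NULL.
b <- by(dat1[c("weight", "height")], dat1[c("age", "gender")],
        colMeans, na.rm = TRUE)

# All label combinations, in the same order as the cells of the array
# (first index, age, varies fastest).
keys <- expand.grid(dimnames(b), stringsAsFactors = FALSE)
keys$age <- as.integer(keys$age)

# Pull the cells out one by one and keep only the non-empty ones.
cells <- lapply(seq_along(b), function(i) b[[i]])
keep  <- !vapply(cells, is.null, logical(1))
vals  <- do.call(rbind, cells[keep])

# Labels and means side by side, with fresh row numbers.
out <- data.frame(keys[keep, , drop = FALSE], vals, row.names = NULL)
out
```

Because the record with missing gender never forms a group in by(), it is absent here, as in the first result above.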