Dear All:

I am trying to calculate the means of four columns in a data frame like this:

FID  MID   IID      EW_INCU  EW_17.5  EMW    EEratio
1    4621  TWF2H5   45.26    NA       15.61  NA
1    4621  TWF2H6   48.02    44.09    13.41  0.3041506
2    4630  TWF2H19  51.44    47.81    NA     NA
2    4631  TWF2H21  NA       52.72    16.70  0.3167678
2    4632  TWF2H22  55.70    50.45    16.48  0.3266601
2    4633  TWF2H23  44.42    40.89    12.96  0.3169479

I tried this code:

> aggregate(df[,4:7],df[,1],mean)

But I couldn't set the argument na.rm=T in the mean() function, so the results are all NAs.

Please tell me how to handle NA values when using aggregate().

Thanks a lot

Yao He
Master candidate in 2nd year
Department of Animal Genetics & Breeding
Room 436, College of Animal Science & Technology,
China Agriculture University, Beijing, 100193
E-mail: yao.h.1988 at gmail.com
Hi,

Try this:

df1<-read.table(text="
FID  MID   IID      EW_INCU  EW_17.5  EMW    EEratio
1    4621  TWF2H5   45.26    NA       15.61  NA
1    4621  TWF2H6   48.02    44.09    13.41  0.3041506
2    4630  TWF2H19  51.44    47.81    NA     NA
2    4631  TWF2H21  NA       52.72    16.70  0.3167678
2    4632  TWF2H22  55.70    50.45    16.48  0.3266601
2    4633  TWF2H23  44.42    40.89    12.96  0.3169479
",sep="",header=TRUE,stringsAsFactors=FALSE)

aggregate(df1[,4:7],by=list(df1[,1]),function(x) mean(x,na.rm=TRUE))
#  Group.1 EW_INCU EW_17.5  EMW EEratio
#1       1    46.6    44.1 14.5   0.304
#2       2    50.5    48.0 15.4   0.320

----- Original Message -----
From: Yao He <yao.h.1988 at gmail.com>
To: r-help at r-project.org
Sent: Saturday, December 15, 2012 10:44 PM
Subject: [R] how to handle NA values in aggregate()
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
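A note on the call in the question: as far as I can tell, aggregate() on a data frame requires `by` to be a list, so a bare column such as df[,1] is rejected before mean() is ever reached. The sketch below wraps the grouping column in a named list, which also labels the result column FID instead of Group.1. The name `res` is only for illustration.

```r
# Same six rows as in the question, read from inline text.
df1 <- read.table(text = "
FID  MID   IID      EW_INCU  EW_17.5  EMW    EEratio
1    4621  TWF2H5   45.26    NA       15.61  NA
1    4621  TWF2H6   48.02    44.09    13.41  0.3041506
2    4630  TWF2H19  51.44    47.81    NA     NA
2    4631  TWF2H21  NA       52.72    16.70  0.3167678
2    4632  TWF2H22  55.70    50.45    16.48  0.3266601
2    4633  TWF2H23  44.42    40.89    12.96  0.3169479
", header = TRUE, stringsAsFactors = FALSE)

# `by` is a list; naming its element names the grouping column in the result.
# The anonymous function forwards na.rm = TRUE to mean() for each column.
res <- aggregate(df1[, 4:7],
                 by  = list(FID = df1$FID),
                 FUN = function(x) mean(x, na.rm = TRUE))
print(res)
```

Each column's mean is taken over whatever non-missing values that column has within the group, so different columns may be averaged over different numbers of rows.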
Check the help file. ?aggregate says that the formula method drops missing values by default (na.action = na.omit) ;) Note that this removes every row that has an NA in any of the columns used, so the result is not the same as passing na.rm = TRUE column by column.

df <- read.table( header = TRUE , text = "FID MID IID EW_INCU EW_17.5 EMW EEratio
1 4621 TWF2H5 45.26 NA 15.61 NA
1 4621 TWF2H6 48.02 44.09 13.41 0.3041506
2 4630 TWF2H19 51.44 47.81 NA NA
2 4631 TWF2H21 NA 52.72 16.70 0.3167678
2 4632 TWF2H22 55.70 50.45 16.48 0.3266601
2 4633 TWF2H23 44.42 40.89 12.96 0.3169479" )

# list the four numeric columns explicitly; `. ~ FID` would also try to average MID and the non-numeric IID
aggregate( cbind( EW_INCU , EW_17.5 , EMW , EEratio ) ~ FID , data = df , mean )

# na.rm would need to be passed to tapply, which is a similar function
tapply( df[ , 4 ] , df[ , 1 ] , mean )
tapply( df[ , 4 ] , df[ , 1 ] , mean , na.rm = TRUE )
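To make the difference between the two kinds of NA handling concrete, here is a minimal sketch on the same data. It assumes the usual idiom of combining na.action = na.pass (keep all rows) with na.rm = TRUE (forwarded to mean()); the names `row_wise` and `col_wise` are illustrative only.

```r
df <- read.table(header = TRUE, text = "FID MID IID EW_INCU EW_17.5 EMW EEratio
1 4621 TWF2H5 45.26 NA 15.61 NA
1 4621 TWF2H6 48.02 44.09 13.41 0.3041506
2 4630 TWF2H19 51.44 47.81 NA NA
2 4631 TWF2H21 NA 52.72 16.70 0.3167678
2 4632 TWF2H22 55.70 50.45 16.48 0.3266601
2 4633 TWF2H23 44.42 40.89 12.96 0.3169479")

# Default na.action = na.omit: a row with an NA in any listed column is
# dropped entirely, so only rows 2, 5 and 6 contribute here.
row_wise <- aggregate(cbind(EW_INCU, EW_17.5, EMW, EEratio) ~ FID,
                      data = df, FUN = mean)

# na.pass keeps every row; na.rm = TRUE is passed on to mean(), so NAs are
# removed separately within each column.
col_wise <- aggregate(cbind(EW_INCU, EW_17.5, EMW, EEratio) ~ FID,
                      data = df, FUN = mean,
                      na.rm = TRUE, na.action = na.pass)

print(row_wise)
print(col_wise)
```

On this data the two results differ: for FID 2, for example, the EW_INCU mean uses two rows in `row_wise` but three in `col_wise`. Which one is appropriate depends on whether incomplete records should count at all.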
Hi,

This should also work:

df1<-read.table(text="
FID  MID   IID      EW_INCU  EW_17.5  EMW    EEratio
1    4621  TWF2H5   45.26    NA       15.61  NA
1    4621  TWF2H6   48.02    44.09    13.41  0.3041506
2    4630  TWF2H19  51.44    47.81    NA     NA
2    4631  TWF2H21  NA       52.72    16.70  0.3167678
2    4632  TWF2H22  55.70    50.45    16.48  0.3266601
2    4633  TWF2H23  44.42    40.89    12.96  0.3169479
",sep="",header=TRUE,stringsAsFactors=FALSE)

aggregate(df1[,4:7],by=list(df1[,1]),mean,na.rm=TRUE)
#  Group.1 EW_INCU EW_17.5  EMW EEratio
#1       1    46.6    44.1 14.5   0.304
#2       2    50.5    48.0 15.4   0.320

#or
library(plyr)
ddply(df1,.(FID),colwise(mean,c("EW_INCU","EW_17.5","EMW","EEratio")),na.rm=TRUE)
#  FID EW_INCU EW_17.5  EMW EEratio
#1   1    46.6    44.1 14.5   0.304
#2   2    50.5    48.0 15.4   0.320

#or
library(data.table)
df2<-data.table(df1)
df3<-df2[,c(1,4:7),with=FALSE]
df3[,lapply(.SD,mean,na.rm=TRUE),by=FID]
#   FID EW_INCU EW_17.5  EMW EEratio
#1:   2    50.5    48.0 15.4   0.320
#2:   1    46.6    44.1 14.5   0.304

A.K.
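Since na.rm=TRUE silently averages over fewer values wherever data are missing, it may help to check how many observations stand behind each group mean. The following is a small base-R sketch, not part of any reply above; `n_obs` is a name chosen for illustration.

```r
df1 <- read.table(text = "
FID  MID   IID      EW_INCU  EW_17.5  EMW    EEratio
1    4621  TWF2H5   45.26    NA       15.61  NA
1    4621  TWF2H6   48.02    44.09    13.41  0.3041506
2    4630  TWF2H19  51.44    47.81    NA     NA
2    4631  TWF2H21  NA       52.72    16.70  0.3167678
2    4632  TWF2H22  55.70    50.45    16.48  0.3266601
2    4633  TWF2H23  44.42    40.89    12.96  0.3169479
", header = TRUE, stringsAsFactors = FALSE)

# Count the non-missing values per group and column, using the same
# aggregate() pattern as for the means.
n_obs <- aggregate(df1[, 4:7],
                   by  = list(FID = df1$FID),
                   FUN = function(x) sum(!is.na(x)))
print(n_obs)
```

For FID 1, for instance, EW_17.5 and EEratio each rest on a single value, which is worth knowing before interpreting those means.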