Fabrice Tourre
2011-Apr-06 08:48 UTC
[R] Calculated mean value based on another column bin from dataframe.
Dear list,

I have a data frame with two columns, as follows.

> head(dat)
       V1      V2
  0.15624 0.94567
  0.26039 0.66442
  0.16629 0.97822
  0.23474 0.72079
  0.11037 0.83760
  0.14969 0.91312

I want the mean of column V2 within each bin of column V1. I wrote the code below. It works, but I do not think it is an elegant way to do it. Any suggestions?

dat <- read.table("dat.txt", header = FALSE)
ran <- seq(0, 0.5, 0.05)            # bin edges
mm <- NULL
for (i in 1:(length(ran) - 1))
{
    # rows whose V1 falls in the bin (ran[i], ran[i+1]]
    fil <- dat[,1] > ran[i] & dat[,1] <= ran[i+1]
    m <- mean(dat[fil,2])
    mm <- c(mm, m)
}
mm

Here are the first 20 rows of my data.

> dput(head(dat,20))
structure(list(V1 = c(0.15624, 0.26039, 0.16629, 0.23474, 0.11037,
0.14969, 0.16166, 0.09785, 0.36417, 0.08005, 0.29597, 0.14856,
0.17307, 0.36718, 0.11621, 0.23281, 0.10415, 0.1025, 0.04238,
0.13525), V2 = c(0.94567, 0.66442, 0.97822, 0.72079, 0.8376,
0.91312, 0.88463, 0.82432, 0.55582, 0.9429, 0.78956, 0.93424,
0.87692, 0.83996, 0.74552, 0.9779, 0.9958, 0.9783, 0.92523, 0.99022
)), .Names = c("V1", "V2"), row.names = c(NA, 20L), class = "data.frame")
Henrique Dallazuanna
2011-Apr-06 12:16 UTC
[R] Calculated mean value based on another column bin from dataframe.
Try this:

fil <- sapply(ran, '<', e1 = dat[,1]) &
       sapply(ran[2:(length(ran) + 1)], '>=', e1 = dat[,1])
mm <- apply(fil, 2, function(idx)mean(dat[idx, 2]))

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
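The logical-matrix idea in this reply can also be written with outer(), which some readers may find easier to follow. What follows is a minimal sketch, not Henrique's code: it assumes the 20 sample rows from the original post, and the names lower and upper are introduced here purely for illustration. It builds one column per bin (ten in total) and then averages V2 over the rows selected by each column.

```r
# Sample rows copied from the dput() output in the original post
dat <- data.frame(
  V1 = c(0.15624, 0.26039, 0.16629, 0.23474, 0.11037, 0.14969, 0.16166,
         0.09785, 0.36417, 0.08005, 0.29597, 0.14856, 0.17307, 0.36718,
         0.11621, 0.23281, 0.10415, 0.1025, 0.04238, 0.13525),
  V2 = c(0.94567, 0.66442, 0.97822, 0.72079, 0.8376, 0.91312, 0.88463,
         0.82432, 0.55582, 0.9429, 0.78956, 0.93424, 0.87692, 0.83996,
         0.74552, 0.9779, 0.9958, 0.9783, 0.92523, 0.99022)
)
ran <- seq(0, 0.5, 0.05)

# Hypothetical helper names: lower and upper edge of each bin
lower <- head(ran, -1)
upper <- tail(ran, -1)

# fil[r, b] is TRUE when row r falls in bin b, i.e. lower[b] < V1 <= upper[b]
fil <- outer(dat$V1, lower, ">") & outer(dat$V1, upper, "<=")

# Mean of V2 per bin; bins with no rows give NaN
mm <- apply(fil, 2, function(idx) mean(dat$V2[idx]))
mm
```

Because every row of the sample lies strictly inside (0, 0.5], each row is TRUE in exactly one column of fil.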
Petr PIKAL
2011-Apr-06 15:22 UTC
[R] Odp: Calculated mean value based on another column bin from dataframe.
Hi

r-help-bounces at r-project.org wrote on 06.04.2011 10:48:04:

> I want to get the column V2 mean value based on the bin of column of
> V1.

Do you want something like this?

#make data
x<-runif(100)
y<-runif(100)

#cut first column to bins (in your case dat[,1] and ran)
x.c<-cut(x, seq(0,1,.1))

#aggregate column 2 according to bins (in your case dat[,2])
aggregate(y,list(x.c), mean)
     Group.1         x
1    (0,0.1] 0.5868734
2  (0.1,0.2] 0.5436263
3  (0.2,0.3] 0.5099366
4  (0.3,0.4] 0.4815855
5  (0.4,0.5] 0.4137687
6  (0.5,0.6] 0.4698156
7  (0.6,0.7] 0.4687639
8  (0.7,0.8] 0.5661048
9  (0.8,0.9] 0.5489297
10   (0.9,1] 0.4812521

Regards
Petr
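Applied to the data from the original post, Petr's cut() and aggregate() suggestion might look like the sketch below. It assumes the 20 sample rows from the dput() output and the same breaks ran; the column name bin and the object names res and by_bin are introduced here for illustration only. cut() uses right-closed intervals (a, b] by default, which matches the > and <= comparisons in the original loop.

```r
# Sample rows copied from the dput() output in the original post
dat <- data.frame(
  V1 = c(0.15624, 0.26039, 0.16629, 0.23474, 0.11037, 0.14969, 0.16166,
         0.09785, 0.36417, 0.08005, 0.29597, 0.14856, 0.17307, 0.36718,
         0.11621, 0.23281, 0.10415, 0.1025, 0.04238, 0.13525),
  V2 = c(0.94567, 0.66442, 0.97822, 0.72079, 0.8376, 0.91312, 0.88463,
         0.82432, 0.55582, 0.9429, 0.78956, 0.93424, 0.87692, 0.83996,
         0.74552, 0.9779, 0.9958, 0.9783, 0.92523, 0.99022)
)
ran <- seq(0, 0.5, 0.05)

# Assign each row to a bin (a, b]; "bin" is a hypothetical column name
dat$bin <- cut(dat$V1, breaks = ran)

# Mean of V2 per bin; aggregate() drops bins that contain no rows
res <- aggregate(V2 ~ bin, data = dat, FUN = mean)
res

# tapply() keeps every bin and reports NA for the empty ones,
# so its result has the same length as mm in the original loop
by_bin <- tapply(dat$V2, dat$bin, mean)
by_bin
```

The choice between the two comes down to the shape wanted: aggregate() returns a data frame with only the populated bins, while tapply() returns a named vector with one entry per bin.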
Fabrice Tourre
2011-Apr-06 21:56 UTC
[R] Odp: Calculated mean value based on another column bin from dataframe.
This is exactly what I want. Thank you very much.