Dan Kortschak
2009-Mar-28 02:15 UTC
[R] calculating average for multiple subclasses in a data set
Hello R users,

I have a data set which is a set of lengths and types of objects. I want to calculate the mean length for each type of object as opposed to the mean of all the objects in the set.

This is in order to make a comparison between the lengths of each type of objects and the number of those objects.

> x
       Chromosome Begin  End      Type   Class     Norm Length
458327          Y     1  318 L2_Plat1b LINE/L2 5.758902    317
458330          Y   439  673 L2_Plat1i LINE/L2 5.455321    234
458331          Y     2  309 L2_Plat1i LINE/L2 5.726848    307
458332          Y  1746 2232 L2_Plat1d LINE/L2 6.186209    486
458333          Y   948 1132 L2_Plat1e LINE/L2 5.214936    184
458335          Y  1511 2043 L2_Plat1f LINE/L2 6.276643    532
458336          Y     1  908 L2_Plat1f LINE/L2 6.810142    907
458337          Y   907 1037 L2_Plat1g LINE/L2 4.867534    130

So a toy set for the relevant parts of the data would be e.g.:

type<-sample(c("L2_Plat1a","L2_Plat1b","L2_Plat1c"),1000,replace=TRUE)
len<-rnorm(1000)
dummy<-as.data.frame(cbind(as.character(type),len))

so looking for

as.data.frame(summary(dummy$V1)) ~ /*average of each type's length*/

as my final goal.

I apologise for the syntax I use (I've come only recently from a perl background, so there is a certain messiness and lack of consideration for style to my coding) - I'm still having a really difficult time figuring out how various data types are used and manipulated in R, but I think I'm slowly getting the hang of it, but any suggestions of a good reference for that (other than the R Introduction which didn't help all that much), would be greatly appreciated.

thanks for any help
Dan
Jorge Ivan Velez
2009-Mar-28 02:31 UTC
[R] calculating average for multiple subclasses in a data set
Dear Dan,

Try this:

with(dummy,tapply(as.numeric(len),as.factor(V1),summary))

See ?with, ?tapply and ?summary for more information.

HTH,
Jorge
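A note on the suggestion above, with a minimal sketch of the stated goal (count and mean length per type, side by side). Two assumptions are made here that are not in the thread: the toy set is built with data.frame() rather than cbind(), so that len stays numeric, and the names set.seed(1), mean_len, n, result and len_factor are illustrative only. The reason for the first change is that cbind() on a character vector and a numeric vector produces a character matrix; depending on the R version, the resulting data frame columns are then character or factor, and as.numeric() applied to a factor returns the level codes rather than the original values.

```r
# Toy data, built with data.frame() so that `len` stays numeric.
# (Assumption: this replaces the cbind() construction from the question.)
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
type <- sample(c("L2_Plat1a", "L2_Plat1b", "L2_Plat1c"), 1000, replace = TRUE)
len <- rnorm(1000)
dummy <- data.frame(type = type, len = len, stringsAsFactors = TRUE)

# Mean length per type: tapply() splits `len` by `type` and applies mean().
mean_len <- tapply(dummy$len, dummy$type, mean)

# Number of objects per type.
n <- table(dummy$type)

# Counts and means side by side, one row per type.
result <- data.frame(
  type = names(mean_len),
  count = as.vector(n[names(mean_len)]),
  mean_length = as.vector(mean_len)
)
print(result)

# If a numeric column has already become a factor, go through character
# to recover the values instead of the internal level codes.
len_factor <- factor(c("1.5", "2.5", "10"))
recovered <- as.numeric(as.character(len_factor))
```

On the real data shown in the question, the same idea would presumably read tapply(x$Length, x$Type, mean), assuming Length is stored as a numeric column. Using mean instead of summary returns just the per-type average; summary gives the full five-number summary plus the mean for each type.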