Hello list.

I am hoping for some help with a relatively simple problem. I have a data frame arranged as below. I want to be able to count the occurrence of each gene (eg let-7e) by Experiment. In other words how many times does a given gene crop up in the dataframe. I tried table but couldn't work out how to get the output I want. I have also considered rearranging this data into a list (by gene) and then counting the length of each gene element. However I thought that there might be a more elegant solution.

Tanaka       Mitchell     Wang         Hunter       Chen         Chim
miR-191*     let-7e       let-7b       miR-126      let-7a       let-7g
miR-198      let-7f       let-7c       miR-146a     let-7b       let-7i
miR-22       let-7g       miR-1224     miR-16       let-7d       miR-130b
miR-223      let-7i       miR-124      miR-191      let-7f       miR-133a
miR-296      let7fG15A    miR-125a-3p  miR-222      let-7g       miR-140
miR-30d      miR-101      miR-125b-5p  miR-223      let-7i       miR-142-5p
miR-370      miR-103      miR-133a     miR-24       miR-101      miR-146
miR-486      miR-125a-5p  miR-133b     miR-26a      miR-103      miR-148a
miR-498      miR-126      miR-135a*    miR-32       miR-106a     miR-152

Thanks for any advice.

Cheers

Iain
Try this:

    # Read the posted table from a string; as.is = TRUE keeps the genes as character
    Lines <- "Tanaka Mitchell Wang Hunter Chen Chim
    miR-191* let-7e let-7b miR-126 let-7a let-7g
    miR-198 let-7f let-7c miR-146a let-7b let-7i
    miR-22 let-7g miR-1224 miR-16 let-7d miR-130b
    miR-223 let-7i miR-124 miR-191 let-7f miR-133a
    miR-296 let7fG15A miR-125a-3p miR-222 let-7g miR-140
    miR-30d miR-101 miR-125b-5p miR-223 let-7i miR-142-5p
    miR-370 miR-103 miR-133a miR-24 miR-101 miR-146
    miR-486 miR-125a-5p miR-133b miR-26a miR-103 miR-148a
    miR-498 miR-126 miR-135a* miR-32 miR-106a miR-152"

    DF <- read.table(textConnection(Lines), header = TRUE, as.is = TRUE)

    # stack() gives one row per gene/experiment pair; xtabs() counts them
    xtabs(~ values + ind, stack(DF))

On Sat, May 23, 2009 at 8:44 AM, Iain Gallagher <iaingallagher at btopenworld.com> wrote:
> Hello list.
>
> I am hoping for some help with a relatively simple problem. I have a data frame arranged as below. I want to be able to count the occurrence of each gene (eg let-7e) by Experiment. In other words how many times does a given gene crop up in the dataframe. I tried table but couldn't work out how to get the output I want. I have also considered rearranging this data into a list (by gene) and then counting the length of each gene element. However I thought that there might be a more elegant solution.
>
> Tanaka       Mitchell     Wang         Hunter       Chen         Chim
> miR-191*     let-7e       let-7b       miR-126      let-7a       let-7g
> miR-198      let-7f       let-7c       miR-146a     let-7b       let-7i
> miR-22       let-7g       miR-1224     miR-16       let-7d       miR-130b
> miR-223      let-7i       miR-124      miR-191      let-7f       miR-133a
> miR-296      let7fG15A    miR-125a-3p  miR-222      let-7g       miR-140
> miR-30d      miR-101      miR-125b-5p  miR-223      let-7i       miR-142-5p
> miR-370      miR-103      miR-133a     miR-24       miR-101      miR-146
> miR-486      miR-125a-5p  miR-133b     miR-26a      miR-103      miR-148a
> miR-498      miR-126      miR-135a*    miR-32       miR-106a     miR-152
>
> Thanks for any advice.
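A note on how the xtabs/stack suggestion works, as a minimal sketch on made-up data. The data frame below and its column names ExpA, ExpB and ExpC are invented for illustration and are not taken from the thread. stack() turns the wide data frame into two columns, values (the gene) and ind (the column it came from), and xtabs() cross-tabulates the two. Summing each row of that table should then give the overall number of times a gene occurs, which is also what table(unlist(DF)) reports.

```r
# Hypothetical toy data in the same wide layout: one column per experiment.
DF <- data.frame(
  ExpA = c("let-7e", "let-7g", "miR-101"),
  ExpB = c("let-7g", "miR-101", "miR-126"),
  ExpC = c("let-7g", "let-7e", "miR-222"),
  stringsAsFactors = FALSE
)

# Long format: 'values' holds the gene, 'ind' the experiment it was listed in.
long <- stack(DF)

# Gene-by-experiment table of counts.
by_exp <- xtabs(~ values + ind, data = long)

# Overall occurrences per gene, summed over experiments.
totals <- rowSums(by_exp)

print(by_exp)
print(sort(totals, decreasing = TRUE))
```

If only the overall counts are needed, table(unlist(DF)) skips the intermediate table; the xtabs() form is the one to keep when the per-experiment breakdown matters.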
On Sat, 23 May 2009 12:44:19 +0000 (GMT) Iain Gallagher <iaingallagher at btopenworld.com> wrote:

IG> I am hoping for some help with a relatively simple problem. I have
IG> a data frame arranged as below. I want to be able to count the
IG> occurrence of each gene (eg let-7e) by Experiment. In other words
IG> how many times does a given gene crop up in the dataframe.

IG> Tanaka Mitchell Wang Hunter Chen Chim
IG> miR-191* let-7e let-7b miR-126 let-7a let-7g
IG> miR-198 let-7f let-7c miR-146a let-7b let-7i

Hi Iain,

I would rearrange the data frame into the following structure:

    Gene     Experiment
    let-7e   Mitchell
    ...

Then you can use ftable(). You could probably use reshape() for your data, but I could not figure out how to do it without a time variable, so here is a quick and dirty way:

    # Small example in the same wide layout (one column per experiment)
    exp.wide <- data.frame(Tim = c("a", "a", "b"),
                           Struppi = c("b", "b", "b"),
                           More = c("a", "b", "b"))
    nam <- names(exp.wide)
    ctL <- ncol(exp.wide)   # number of experiments
    ctW <- nrow(exp.wide)   # number of genes per experiment

    # Start with the first column, then append the remaining ones
    exp.long <- cbind(as.character(exp.wide[, 1]), rep(nam[1], ctW))
    for (i in 2:ctL) {
      tmp <- cbind(as.character(exp.wide[, i]), nam[i])
      exp.long <- rbind(exp.long, tmp)
    }
    exp.long <- as.data.frame(exp.long)
    names(exp.long) <- c("gene", "exp")

    # Flat contingency table of gene by experiment
    ftable(exp.long)

hth
Stefan
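For what it is worth, the loop above can probably be avoided. The following is a sketch under assumptions, not Stefan's method: it reuses the toy exp.wide from his message, builds the long format in one step with unlist() and rep(), and then tries the reshape() route he mentions by passing the column names themselves as the "times". The object names exp.long2, tab and tab2 are introduced here only for illustration.

```r
# Same toy data as in the message above.
exp.wide <- data.frame(Tim = c("a", "a", "b"),
                       Struppi = c("b", "b", "b"),
                       More = c("a", "b", "b"),
                       stringsAsFactors = FALSE)

# Vectorised equivalent of the loop: unlist() walks the columns in order,
# so each column name is repeated once per row.
exp.long <- data.frame(gene = unlist(exp.wide, use.names = FALSE),
                       exp  = rep(names(exp.wide), each = nrow(exp.wide)),
                       stringsAsFactors = FALSE)

# reshape() variant: all columns are "varying" and the column names serve
# as the values of the time variable, here called 'exp'.
exp.long2 <- reshape(exp.wide, direction = "long",
                     varying = names(exp.wide), v.names = "gene",
                     timevar = "exp", times = names(exp.wide))

# Both long versions should give the same gene-by-experiment counts.
tab  <- with(exp.long,  table(gene, exp))
tab2 <- with(exp.long2, table(gene, exp))

print(ftable(tab))
```

reshape() also adds an id column and composite row names, which are harmless here because only gene and exp are tabulated.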