ads pit
2011-Jun-08 22:06 UTC
[R] return counts of elements on a table column depending on elements on another column
Hi,

I am given the following table:

> head(hsa_refseq)
   chr       genome region    start     stop nu strand nu.1    nu.2   gene_id
1 chr1 hg19_refGene    CDS 67000042 67000051  0      +    0 gene_id NM_032291
2 chr1 hg19_refGene   exon 66999825 67000051  0      +    . gene_id NM_032291
3 chr1 hg19_refGene    CDS 67091530 67091593  0      +    2 gene_id NM_032291
4 chr1 hg19_refGene   exon 67091530 67091593  0      +    . gene_id NM_032291
5 chr1 hg19_refGene    CDS 67098753 67098777  0      +    1 gene_id NM_032291
6 chr2 hg19_refGene   exon 67098753 67098777  0      +    . gene_id NM_032291

What I've done so far is count how many of the elements in the third column are "CDS" and how many are "exon":

sum(hsa_refseq$region=="CDS")
sum(hsa_refseq$region=="exon")

But what I would like is to print, for each chromosome, how many are exons and how many are CDS. For example:

chr1 has 5 CDS and 2 exons
chr2 has 10 CDS and 3 exons
...

Can you tell me what I should add? Or, if I am doing this wrong, how should I do it?

Thank you,
Regards,
Nanami
Martin Morgan
2011-Jun-08 22:34 UTC
[R] return counts of elements on a table column depending on elements on another column
On 06/08/2011 03:06 PM, ads pit wrote:
> But what I would like is to print for each chromosome how many are exons
> and how many CDS. For example
> chr1 has 5 CDS and 2 exons
> chr2 has 10 CDS and 3 exons...
>
> Can you tell what should I add? Or if I am doing this wrong, how should I do
> it?

Hi Nanami --

xtabs(~chr + region, hsa_refseq)

might do the trick.

Martin

-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
Location: M1-B861
Telephone: 206 667-2793
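For anyone following along, here is a minimal sketch of the xtabs() suggestion on toy data. The data frame below is an assumption: it reproduces only the chr and region columns of the six rows shown by head(hsa_refseq), not the full table, so the counts are for illustration only. The names counts and counts2 are hypothetical.

```r
# Toy data (assumption): only the two columns needed for counting,
# copied from the six rows printed by head(hsa_refseq).
hsa_refseq <- data.frame(
  chr    = c("chr1", "chr1", "chr1", "chr1", "chr1", "chr2"),
  region = c("CDS", "exon", "CDS", "exon", "CDS", "exon")
)

# Cross-tabulate chromosome (rows) against region type (columns).
counts <- xtabs(~ chr + region, data = hsa_refseq)
print(counts)

# table() yields the same counts without the formula interface.
counts2 <- table(hsa_refseq$chr, hsa_refseq$region)

# Individual cells can be looked up by name.
counts["chr1", "CDS"]
counts["chr2", "exon"]

# Long format: one row per (chr, region) combination, with a Freq column.
as.data.frame(counts)
```

On the real data the same call should give one row per chromosome, with the CDS and exon counts side by side; combinations that never occur show up as 0.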