Waverley @ Palo Alto wrote:
> Hi,
>
> I have a list of IPI gene IDs. I want to find out whether there is a
> package which can map the gene ontology to these IPIs, and plot the
> pie chart to demonstrate the molecular function distributions.
>
> The input is like the following gene IPI IDs:
>
> IPI:IPI00008860.1|SWISS-PROT:Q9BXJ4-1|TREMBL:Q542Y2|ENSEMBL:ENSP00000231338;EN
>
> IPI:IPI00019922.5|SWISS-PROT:Q8N0Y2-1|TREMBL:Q53F81|ENSEMBL:ENSP00000338860;ENSP00000375594|REFSEQ:NP_060807|H-INV:HIT000028861|VEGA:OTTHUMP00000078377
> Tax_Id=9606 Gene_Symbol=ZN
>
> IPI:IPI00647423.2|SWISS-PROT:Q8N819-1|REFSEQ:NP_001073870|VEGA:OTTHUMP00000076687
> Tax_Id=9606 Gene_Symbol=FLJ40125 Isoform 1 of
>
> IPI:IPI00219000.2|SWISS-PROT:P27658|TREMBL:Q53XI6|ENSEMBL:ENSP00000261037|REFS
>
> IPI:IPI00291878.4|SWISS-PROT:P35247|ENSEMBL:ENSP00000361366|REFSEQ:NP_003010|H-INV:HIT000039466|VEGA:OTTHUMP00000019944
>
> IPI:IPI00013945.1|SWISS-PROT:P07911-1|TREMBL:Q8NHW8|ENSEMBL:ENSP00000306279|RE
>
> IPI:IPI00000634.1|SWISS-PROT:Q16204|TREMBL:Q6GSG7|ENSEMBL:ENSP00000263102|REFS
>
> I want to plot the distribution of these genes across the GO
> molecular function categories as a pie chart. An example is shown at
> the following link: http://www.proteomesci.com/content/7/1/6/figure/F2?highres=y
>
>
> Can someone help?
I'm not sure that it is this easy. The IPI ids are protein identifiers,
whereas GO categories classify genes. Neither the mapping from protein
to gene nor the mapping from gene to GO category is 1:1, and GO
categories form a hierarchy. So there are significant decisions to be
made in representing IPI identifiers in a pie chart of GO terms.

Bioconductor maintains 'org' and 'GO' database packages that provide
the necessary link between IPI protein ids and GO gene ontology
categories, via ENTREZ gene ids. Code might look like
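## once only, to install packages (BiocManager has replaced the
## biocLite() script on current Bioconductor releases)
if (!requireNamespace('BiocManager', quietly=TRUE))
    install.packages('BiocManager')
BiocManager::install(c('org.Hs.eg.db', 'GO.db'))
## from IPI to ENTREZ id, not 1:1
library(org.Hs.eg.db)
## Assume ipiIds is, e.g., c('IPI00008860', 'IPI00019922'); this also
## assumes 'IPI' is listed by keytypes(org.Hs.eg.db) in your version
ipi2eg = select(org.Hs.eg.db, keys=ipiIds, keytype='IPI',
                columns='ENTREZID') ## NOT 1:1 map
egIds = unique(ipi2eg$ENTREZID[!is.na(ipi2eg$ENTREZID)])
## get GO terms, also not 1:1
goIds = select(org.Hs.eg.db, keys=egIds, keytype='ENTREZID',
               columns=c('GO', 'ONTOLOGY'))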
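The lookup above assumes bare accessions such as 'IPI00008860', whereas the lines in the question carry a version suffix and cross references. A minimal base R sketch of stripping those might look like the following; extractIpi is a hypothetical helper name, not a function from any package, and the two headers are lines from the question cut off after the TREMBL field.

```r
## Hypothetical helper (not from any package): pull the bare IPI
## accession out of header lines like those in the question.
extractIpi = function(headers) {
    ## keep 'IPI' plus digits that follow the 'IPI:' tag; the '.N'
    ## version suffix and everything after it are dropped
    hits = regexpr('(?<=IPI:)IPI[0-9]+', headers, perl=TRUE)
    unique(regmatches(headers, hits))
}

## two header lines from the question, shortened for illustration
headers = c(
    'IPI:IPI00008860.1|SWISS-PROT:Q9BXJ4-1|TREMBL:Q542Y2',
    'IPI:IPI00019922.5|SWISS-PROT:Q8N0Y2-1|TREMBL:Q53F81')
ipiIds = extractIpi(headers)
```

Lines without an 'IPI:' tag are silently skipped, so it is worth comparing length(ipiIds) with the number of input lines.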
You're still left with the problem of resolving multiple mappings and
the hierarchical relationship between GO terms. Asking on the
Bioconductor mailing list
http://bioconductor.org/docs/mailList.html
is likely to lead to helpful answers.
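As a rough illustration of one such decision, the sketch below gives each IPI id a total weight of 1 and splits it evenly across the terms it maps to, so that ids with many annotations do not dominate the chart. This is only one possible convention, not an established method. The data frame ipiGo, its column names, and the id labels are placeholders made up for the example rather than real annotations, and nothing here deals with the GO hierarchy; terms would first have to be rolled up to whatever set of categories is chosen.

```r
## Hypothetical input: one row per (IPI id, GO term) pair. The ids and
## the pairings are placeholders, not real annotations.
ipiGo = data.frame(
    IPI  = c('ipiA', 'ipiA', 'ipiB', 'ipiC', 'ipiC', 'ipiC'),
    TERM = c('binding', 'catalytic activity', 'binding',
             'binding', 'transporter activity', 'catalytic activity'),
    stringsAsFactors = FALSE)

## drop duplicate pairs, then give each IPI a total weight of 1,
## split evenly across the terms it maps to
ipiGo = unique(ipiGo)
nTerms = table(ipiGo$IPI)
ipiGo$weight = 1 / as.numeric(nTerms[ipiGo$IPI])

## sum the fractional weights per term and plot them
share = tapply(ipiGo$weight, ipiGo$TERM, sum)
pie(share, main='GO molecular function (fractional counts)')
```

Counting every (id, term) pair as 1 instead is the other obvious choice; it is simpler, but the slices then no longer add up to the number of proteins.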
Martin
> Thanks much in advance.
>
> Merry Christmas!!
--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793