Atul Kakrana
2013-Apr-03 22:17 UTC
[R] Select single probe-set with median expression from multiple probe-sets corresponding to same gene -AFFY
Hello All,

I need your help. I am analysing Affymetrix data and have to select the probe-set that has the median expression among all the probe-sets for the same gene. This way I want to remove redundancy by keeping the analysis at the level of a single entry per gene. I am fully aware that it is not a nice thing to do, but I just have to do it.

To do so, I came across the 'findLargest' function of the 'genefilter' package, but it is not well documented and I do not know how to use it. At this point I have:

    esetRMA <- rma(mydata)

Could anybody guide me on how I can select the single probe-set with median expression from multiple probe-sets corresponding to a single gene and discard the others? Is there any other way to achieve this, i.e. other than using 'genefilter'?

Genefilter package: http://www.bioconductor.org/packages/2.11/bioc/html/genefilter.html

Thanks

AK

--
Atul Kakrana
Delaware Technology Park
Martin Morgan
2013-Apr-04 03:34 UTC
[R] Select single probe-set with median expression from multiple probe-sets corresponding to same gene -AFFY
On 04/03/2013 03:17 PM, Atul Kakrana wrote:
> Hello All,
>
> I need your help. I am analysing affymetrix data and have to select the
> probe-set that has median expression among all the probe-sets for same
> gene. This way I want to remove the redundancy by keeping the analysis
> to single gene entry level. I am fully aware that it is not a nice thing
> to do but I just have to do it.
>
> To do so, I came across 'findLargest' function of 'genefilter' package
> but it's not well documented; and I do not know how to implement the
> 'findLargest' function. At this point I have:
> esetRMA <- rma(mydata)
>
> Could anybody guide me on how can I select single probeset with median
> expression from multiple probe-sets corresponding to single gene and
> discard others? Is there any other way to achieve so i.e. other than
> using 'genefilter'?
>
> Genefilter package:
> http://www.bioconductor.org/packages/2.11/bioc/html/genefilter.html

Hi Atul --

It's a Bioconductor package, so you might as well ask on the Bioconductor mailing list instead

    http://bioconductor.org/help/mailing-list/

As a reproducible example, load the Biobase and genefilter packages and the "ALL" sample ExpressionSet

    library(Biobase)
    library(ALL)
    library(genefilter)
    data(ALL)

The three arguments to findLargest are the names of the probe sets

    featureNames(ALL)

the test statistic

    rowMedians(ALL)

and the chip on which the ExpressionSet is based

    annotation(ALL)

So the variable

    idx = findLargest(featureNames(ALL), rowMedians(ALL), annotation(ALL))

identifies the probes and

    ALL1 = ALL[idx,]

gets you the data you're interested in. Again, follow-up questions should go to the Bioconductor mailing list.

Martin

> Thanks
>
> AK

--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793
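A note on the original question: as documented in genefilter, findLargest keeps, for each gene, the probe set with the largest value of the supplied statistic, so the call in the thread returns the probe set with the highest median across samples rather than the "middle" probe set. If the middle one is really what is wanted, a minimal base-R sketch along the following lines may help. Everything in it is an assumption made for illustration: the toy matrix `expr`, the probe-set-to-gene vector `gene`, the helper `pick_median`, and the rule of taking the probe set whose summary lies closest to the per-gene median of the summaries (ties go to the first probe set). With a real ExpressionSet, the matrix would presumably come from exprs() and the mapping from the chip's annotation package.

```r
# Toy expression matrix: 6 hypothetical probe sets x 3 samples (made-up values).
expr <- matrix(
  c(5, 6, 7,
    1, 2, 3,
    9, 9, 9,
    4, 4, 4,
    8, 7, 6,
    2, 2, 2),
  ncol = 3, byrow = TRUE,
  dimnames = list(paste0("ps", 1:6), paste0("s", 1:3))
)

# Hypothetical probe-set-to-gene mapping, one entry per row of expr.
gene <- c("A", "A", "A", "B", "B", "C")

# Summary statistic per probe set: its median across samples.
stat <- apply(expr, 1, median)

# Within one gene, return the probe set whose summary is closest to the
# median of that gene's summaries; which.min() breaks ties by taking the first.
pick_median <- function(ids) {
  s <- stat[ids]
  ids[which.min(abs(s - median(s)))]
}

# Apply the rule gene by gene and subset the matrix to the chosen probe sets.
keep <- unname(vapply(split(rownames(expr), gene), pick_median, character(1)))
expr_median <- expr[keep, , drop = FALSE]
```

For a gene with an even number of probe sets there is no single middle one, so some tie-breaking rule is unavoidable; the choice made here (first of the two closest) is arbitrary and could just as well be replaced by the larger of the two.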