Hi:

Apologies for asking the following question. It may sound very basic for this forum, but I honestly do not know how to solve it and I do not have a teacher who can help me understand.

I have a list of 200 genes that are involved in a particular process; I call this the ProcSet. From an independent experiment I found that, out of 10,000 genes, 1500 are significant; I call these 1500 genes the ResultSet.

The intersection of ResultSet and ProcSet is 80 genes. That means 40% of ProcSet is significant.

How do I test whether that 40% is significant, i.e. more than I would expect by chance, given the ResultSet and the 10,000 genes I evaluated in the experiment?

What I have:
n = 200 (ProcSet)
p = 0.4 (fraction of ProcSet in ResultSet, 80/200)
N = 1500 (ResultSet)
N1 = 10,000 (genes evaluated)
Pn = 0.15 (N/N1)

What kind of test will help me know whether 0.4 is significant given 0.15? Any suggestions will greatly help me.

Thank you.
Srini
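One common way to frame the numbers above is as an over-representation (one-sided hypergeometric) test: if 200 genes were drawn at random from the 10,000 evaluated, of which 1500 are significant, how likely is an overlap of 80 or more? The sketch below is only an illustration under that assumption; the function names are hypothetical, and it treats the 200 ProcSet genes as all being among the 10,000 evaluated, which the question does not state explicitly.

```python
from math import lgamma, exp

def log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def hypergeom_upper_tail(overlap, set_size, hits, total):
    """P(X >= overlap) when set_size genes are drawn without replacement
    from total genes, of which hits are significant."""
    lower = max(overlap, set_size - (total - hits), 0)
    upper = min(set_size, hits)
    if lower > upper:
        return 0.0
    log_denom = log_choose(total, set_size)
    log_terms = [
        log_choose(hits, k) + log_choose(total - hits, set_size - k) - log_denom
        for k in range(lower, upper + 1)
    ]
    # Sum in log space to avoid underflow in the individual terms.
    m = max(log_terms)
    return exp(m) * sum(exp(t - m) for t in log_terms)

# Numbers taken from the question.
proc_set = 200      # n: genes in ProcSet
result_set = 1500   # N: significant genes (ResultSet)
evaluated = 10000   # N1: genes evaluated
overlap = 80        # genes in both sets

expected_overlap = proc_set * result_set / evaluated  # 30 genes by chance
p_value = hypergeom_upper_tail(overlap, proc_set, result_set, evaluated)
print("expected overlap by chance:", expected_overlap)
print("one-sided p-value:", p_value)
```

A very small p-value here would indicate that an overlap of 80 is far more than the 30 expected by chance. In R, the same upper tail should be available as `phyper(79, 1500, 8500, 200, lower.tail = FALSE)`.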
Hi Srini,

This is a statistics question, not a question about R, so this may not be the best place to ask. Try posting at http://stats.stackexchange.com/ or another statistics help list.

Best,
Ista

On Thu, Jan 31, 2013 at 11:11 PM, Srinivas Iyyer <srini_iyyer_bio at yahoo.com> wrote:
> What kind of test will help me know that 0.4 is significant given 0.15. Any suggestions will greatly help me.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Hello Srini,

It sounds as if you are attempting to establish a prior probability and compare it to the posterior probability -- a perfect candidate for Bayesian analysis. I would simply search for 'Bayesian analysis of gene expression data'. A number of R packages are available, as well as a software package from Yale: http://www.yale.edu/townsend/Software/BAGELTutorial.html

Hope this helps,
James

--
James C. Whanger