Dear List,

We are planning a genotyping study to be analyzed using false discovery rates (FDRs) (see Storey and Tibshirani, PNAS 2003; 100:9440-5). I am interested in learning whether there is any consensus as to how many features (i.e., how many P values) need to be studied before reasonably reliable FDRs can be derived. Does anyone know of a citation where this is discussed?

Bill Dupont

William D. Dupont
phone: 615-343-4100
URL http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/WilliamDupont
Two thoughts on this:

1. Your FDR (not Franklin Delano Roosevelt) sounds like another name for Type I error rate. The definition of "reasonably reliable FDRs" would seem to depend on the state of the literature on this issue among researchers in genotyping. As more reports of FDRs in genotyping are published, I would expect the methodology for estimation and the standard for accuracy to evolve similarly.

2. Have you tried the Bioconductor (www.bioconductor.org/) listserv? They might be able to say something more useful than a general list like this.

spencer graves

> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

--
Spencer Graves, PhD
Senior Development Engineer
PDF Solutions, Inc.
333 West San Carlos Street Suite 700
San Jose, CA 95110, USA
spencer.graves at pdf.com
www.pdf.com <http://www.pdf.com>
Tel: 408-938-4420
Fax: 408-280-7915
Kjetil Brinchmann Halvorsen
2005-Sep-22 02:45 UTC
[R] FDR analyses: minimum number of features
Spencer Graves wrote:
> Two thoughts on this:
>
> 1. Your FDR (Not Franklin Delano Roosevelt) sounds like another name
> for Type I error rate.

It is certainly not the same as the Type I error rate. The Type I error rate is the proportion of true nulls which are rejected, while the FDR is the proportion of rejected null hypotheses which really are true nulls!

To me FDR seems like a more promising avenue to multiple testing than the old "familywise error rate". Who knows what is a family?

Kjetil

--
Kjetil Halvorsen.

Peace is the most effective weapon of mass construction.
 -- Mahdi Elmandjra
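To make the distinction above concrete, here is a minimal sketch in Python. The function name `error_summary` and all the counts are hypothetical, chosen only for illustration. It computes the realized Type I error rate (false rejections divided by true nulls) and the false discovery proportion (false rejections divided by all rejections) for a single set of tests; the FDR is usually defined as the expected value of that proportion over repeated experiments.

```python
def error_summary(is_true_null, rejected):
    """Return (Type I error rate, false discovery proportion) for one set of tests."""
    n_true_null = sum(is_true_null)
    n_rejected = sum(rejected)
    # False rejections: the null was true, yet it was rejected.
    n_false_rej = sum(1 for null, rej in zip(is_true_null, rejected) if null and rej)
    type1_rate = n_false_rej / n_true_null if n_true_null else 0.0
    fdp = n_false_rej / n_rejected if n_rejected else 0.0
    return type1_rate, fdp


# Hypothetical counts: 900 true nulls (45 rejected), 100 real effects (80 rejected).
is_true_null = [True] * 900 + [False] * 100
rejected = [True] * 45 + [False] * 855 + [True] * 80 + [False] * 20

type1_rate, fdp = error_summary(is_true_null, rejected)
print(type1_rate, fdp)  # 0.05 0.36
```

With these made-up numbers, a 5% Type I error rate coexists with 36% of the discoveries being false, because true nulls greatly outnumber real effects.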
I agree. What is unclear to me is the optimal way of justifying sample size and SNP selection in grant applications that use the FDR approach.
Have you considered Monte Carlo? From previous work, you could estimate a distribution for the differences to be detected and use that as input to a Monte Carlo simulation, thereby computing a distribution for the FDR as a function of the distribution of differences and the number of features. From this, you could estimate the probabilities of obtaining results that were bogus vs. marginal, barely useful vs. highly accurate, and plot them against alternative budgets, etc.

I hope this comment makes more sense than my earlier nonsense.

spencer graves
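The Monte Carlo idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method from the cited paper: it uses the Benjamini-Hochberg step-up procedure in place of Storey-Tibshirani q-values, independent z-tests, and a single fixed effect size. The parameters `pi1` (fraction of features with a real effect), `effect` (mean shift of the test statistic), and `q` (target FDR level) are hypothetical planning inputs that would have to come from previous work.

```python
import math
import random
import statistics


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def bh_reject(pvals, q):
    """Benjamini-Hochberg step-up: list of booleans, True where the null is rejected."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0  # largest rank whose p-value falls under its BH threshold
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= q * rank / m:
            k = rank
    rejected = [False] * m
    for i in order[:k]:
        rejected[i] = True
    return rejected


def simulate_fdp(m, pi1, effect, q, n_sims, seed=0):
    """Simulate n_sims studies of m features; return each study's realized
    false discovery proportion (0.0 when nothing is rejected)."""
    rng = random.Random(seed)
    fdps = []
    for _ in range(n_sims):
        is_alt = [rng.random() < pi1 for _ in range(m)]
        pvals = [two_sided_p(rng.gauss(effect if alt else 0.0, 1.0)) for alt in is_alt]
        rejected = bh_reject(pvals, q)
        n_rej = sum(rejected)
        n_false = sum(1 for rej, alt in zip(rejected, is_alt) if rej and not alt)
        fdps.append(n_false / n_rej if n_rej else 0.0)
    return fdps


if __name__ == "__main__":
    # Compare the spread of the realized proportion across feature counts.
    for m in (50, 500, 2000):
        fdps = simulate_fdp(m=m, pi1=0.1, effect=3.0, q=0.05, n_sims=200)
        print(m, round(statistics.mean(fdps), 3), round(statistics.pstdev(fdps), 3))
```

Plotting the spread of the simulated proportions against the number of features (or against the budget that number implies) is one way to give "reasonably reliable" a working meaning. Correlated markers, as in linked SNPs, would need a more realistic data-generating step than the independent draws assumed here.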
Thanks. That is an excellent idea.

Bill

-----Original Message-----
From: Spencer Graves [mailto:spencer.graves at pdf.com]
Sent: Thursday, September 22, 2005 9:01 PM
To: Dupont, William
Cc: Kjetil Brinchmann Halvorsen; r-help at stat.math.ethz.ch
Subject: Re: [R] FDR analyses: minimum number of features