David Swanepoel
2021-Aug-24 14:44 UTC
[R] Query regarding stats/p.adjust package (base) - specifically 'Hochberg' function
Dear R Core Dev Team, I hope all is well your side!

My apologies if this is not the correct point of contact to use to address this. If not, kindly advise or forward my request to the relevant team/persons.

I have a query regarding the 'Hochberg' method of the stats/p.adjust R package and hope you can assist me please. I have attached the data I used in Excel, which are lists of p-values for two different tests (Hardy Weinberg Equilibrium and Linkage Disequilibrium) for four population groups.

The basis of my concern is a discrepancy specifically between the Hochberg correction applied by four different R packages and the results of the Hochberg correction by the online tool, MultipleTesting.com <http://www.multipletesting.com/>.

Using the below R packages/functions, I ran multiple test correction (MTC) adjustments for the p-values listed in my dataset. All R packages below agreed with each other regarding the 'significance' of the p-values for the Hochberg adjustment.

* stats/p.adjust (method: Hochberg)
* mutoss/hochberg
* multtest/mt.rawp2adjp (procedure: Hochberg)
* elitism/mtp (method: Hochberg)

In checking the same values on the MultipleTesting.com, more p-values were flagged as significant for both the HWE and LD results across all four populations. I show these differences in the Excel sheet attached.

Essentially, using the R packages, only the first HWE p-value of Pop2 is significant at an alpha of 0.05. Using the MT.com tool, however, multiple p-values are shown to be significant across both tests with the Hochberg correction (the highlighted cells in the Excel sheet).

I asked the authors of MT.com about this, and they gave the following response:

"we have checked the issue, and we believe the computation by our page is correct (I cannot give opinion about the other packages).
When we look on the original Hochberg paper, and we only use the very first (smallest) p value, then m"=1, thus, according to the equation in the Hochberg 1988 paper, in this case practically there is no further correction necessary.
In other words, in case the *smallest* p value is smaller than alpha, then the *smallest* p value will remain significant irrespective of the other p values when we make the Hochberg correction."

I have attached the Hochberg paper here but, unfortunately, I don't understand enough of the stats to verify this. I have applied their logic on the same Excel sheet under the section "MT.com explanation", which shows why they consider the highlighted values significant.

I have also attached the 2 R files that I used to do the MTC runs and they can be run as is. They are just quite long as they contain many of the other MTC methods in the different packages too.

Kindly provide your thoughts as to whether you agree with this interpretation of the Hochberg paper or not? I would like to see concordance between the MT.com tool and the different R packages above (or understand why they are different), so that I can be more confident in the explanations of my own results as a stats layman.

I hope this makes sense. Please let me know if I need to clarify anything.

Many thanks and kind regards,
David

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Hochberg - 1988 - A Sharper Bonferroni Procedure for Multiple Tests of Significance.pdf
Type: application/pdf
Size: 325078 bytes
Desc: Hochberg - 1988 - A Sharper Bonferroni Procedure for Multiple Tests of Significance.pdf
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20210824/19a37779/attachment.pdf>
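
For readers following the Hochberg question, here is a minimal sketch of the step-up adjustment as it is commonly stated: sort the p-values from largest to smallest, multiply the largest by 1, the next by 2, and so on up to m for the smallest, then take a running minimum. To the best of my understanding this is also what `p.adjust(p, method = "hochberg")` computes, but treat that as an assumption and check the R source. The function name `hochberg_adjust` and the example p-values are hypothetical and are not taken from the attached data; Python is used only for illustration.

```python
def hochberg_adjust(pvalues):
    """Hochberg step-up adjusted p-values (illustrative pure-Python sketch)."""
    m = len(pvalues)
    # Indices ordered from the largest p-value down to the smallest.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted p-values are capped at 1
    for rank, idx in enumerate(order, start=1):
        # rank is 1 for the largest p-value and m for the smallest,
        # so the smallest p-value carries the heaviest multiplier.
        running_min = min(running_min, rank * pvalues[idx])
        adjusted[idx] = running_min
    return adjusted


alpha = 0.05

# Hypothetical case 1: the smallest raw p-value is below alpha,
# but no p-value passes its own step-up threshold.
pvals = [0.04, 0.5, 0.6]
adj = hochberg_adjust(pvals)
significant = [p <= alpha for p in adj]
print(adj, significant)

# Hypothetical case 2: the largest raw p-value is below alpha,
# so every hypothesis is rejected in one step.
print(hochberg_adjust([0.03, 0.04, 0.045]))
```

Under this sketch it is the largest p-value that is compared with alpha uncorrected, while the smallest is in effect compared with alpha/m unless a larger p-value has already passed its threshold. In the first case the raw value 0.04 is adjusted to roughly 0.12 and is not significant at 0.05, which would be consistent with the R packages flagging fewer values than a tool that leaves the smallest p-value uncorrected.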
Bert Gunter
2021-Aug-24 17:50 UTC
[R] Query regarding stats/p.adjust package (base) - specifically 'Hochberg' function
1. No Excel attachments made it through. Binary attachments are generally stripped by the list server for security reasons.

2. As you may have already learned, this is the wrong forum for statistics or package specific questions. Read *and follow* the posting guide linked below to post on r-help appropriately. In particular, for questions about specific non-standard packages, contact package maintainers (found through e.g. ?maintainer)

3. Statistics issues generally don't belong here. Try stats.stackexchange.com instead perhaps.

4. We are not *R Core development,* and you probably should not be contacting them either. See here for general guidelines for R lists: https://www.r-project.org/mail.html

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Tue, Aug 24, 2021 at 10:39 AM David Swanepoel <davidswanepoel at hotmail.com> wrote:
> Dear R Core Dev Team, I hope all is well your side!
> <SNIP>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
Rolf Turner
2021-Aug-24 21:46 UTC
[R] Query regarding stats/p.adjust package (base) - specifically 'Hochberg' function
On Tue, 24 Aug 2021 14:44:55 +0000 David Swanepoel <davidswanepoel at hotmail.com> wrote:

> Dear R Core Dev Team, I hope all is well your side!
> My apologies if this is not the correct point of contact to use to
> address this. If not, kindly advise or forward my request to the
> relevant team/persons.
>
> I have a query regarding the 'Hochberg' method of the stats/p.adjust
> R package and hope you can assist me please. I have attached the data
> I used in Excel,

<SNIP>

In addition to the good advice given to you earlier by Bert Gunter, you should consider the following advice:

Don't use Excel!!!

This is a corollary of a more general theorem:

Don't use Micro$oft!!!

cheers,

Rolf Turner

--
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276