Eik, thanks for posting this. I thought that the page was making the usual (just somewhat flawed) argument that once the dfs are sufficiently large, whether one does pnorm(...) or pt(..., df=<>) makes little difference (although far out in the tails it still does).

Your post made me look at the page and I hope nobody takes anything written there seriously. The argument is utterly wrong. I am absolutely flabbergasted that somebody could write so many pages of text based on such a flawed understanding of basic statistical concepts. Just to give some examples:

"The next issue I have is that I can't see the underlying data. So I don't know what the actual shape of the distribution is, but it's probably fair to say it's normally distributed (assuming the Central Limit Theorem applies)." The CLT says nothing about the distribution of the raw data.

"As the sample size increases, samples will begin to operate and appear more and more like the population they are drawn from. This is the Law of Large Numbers." The law of large numbers has nothing to do with this.

And as Eik already pointed out, the 'z-test' the author is describing is not a test at all, but essentially just calculates the standardized mean difference (and computing a p-value from it makes no sense).

Best,
Wolfgang

> -----Original Message-----
> From: R-help <r-help-bounces at r-project.org> On Behalf Of Eik Vettorazzi via R-help
> Sent: Tuesday, November 4, 2025 20:44
> To: Petr Pikal <peprcon.asc at centrum.cz>; Christophe Dutang <dutangc at gmail.com>
> Cc: r-help at r-project.org
> Subject: Re: [R] [EXT] Re: A very small p-value
>
> Hi,
> Stepping briefly outside the R context, I noticed a statistical point in the text you linked that, in my opinion, isn't quite right. I believe there's a key misunderstanding here: the statement that the z-test does not depend on the number of cases is incorrect. The p-value of the z-test is, just like that of other tests, very much dependent on the sample size, assuming the same mean difference and standard deviation.
> The text you linked is actually calculating an effect size, which is (largely) independent of the sample size. The effect size answers the question of how "relevant" or "large" the difference between groups is. This is fundamentally different from testing for "significant" differences.
> Specifically, the crucial 1/\sqrt{n} term, which is necessary for calculating the standard error of the mean difference, seems to be missing from the presented formula for the z-score. I just wanted to point this out quickly.
>
> Best regards
>
> On 27.10.2025 at 14:12, Petr Pikal wrote:
> > Hello
> >
> > The t-test is probably not the best option in your case. With 95 observations your data behave more like a population and you may get better insight using a z-test. See
> > https://toxictruthblog.com/avoiding-little-known-problems-with-the-t-test/
> >
> > Best regards.
> > Petr
> >
> > On Sat, Oct 25, 2025 at 11:46, Christophe Dutang <dutangc at gmail.com> wrote:
> >
> >> Dear list,
> >>
> >> I'm computing a p-value for the Student t-test and discovered some inconsistencies with the cdf pt().
> >>
> >> The observed statistic is 11.23995 for 95 observations, so the p-value is very small:
> >>
> >>> t_score <- 11.23995
> >>> n <- 95
> >>> print(pt(t_score, df = n-2, lower=FALSE), digits=22)
> >> [1] 2.539746620181247991746e-19
> >>> print(integrate(dt, lower=t_score, upper=Inf, df=n-2)$value, digits = 22)
> >> [1] 2.539746631161970791961e-19
> >>
> >> But if I compute with pt(lower=TRUE), I get 0:
> >>
> >>> print(1-pt(t_score, df = n-2, lower=TRUE), digits=22)
> >> [1] 0
> >>
> >> Indeed, the p-value is smaller than the machine epsilon:
> >>
> >>> pt(t_score, df = n-2, lower=FALSE) < .Machine$double.eps
> >> [1] TRUE
> >>
> >> Using the square of the t statistic, which follows an F distribution, I get the same issue:
> >>
> >>> z <- t_score^2
> >>> print(pf(z, 1, n-2, lower=FALSE), digits=22)
> >> [1] 5.079493240362495983491e-19
> >>> print(integrate(df, lower=z, upper=Inf, df1=1, df2=n-2)$value, digits = 22)
> >> [1] 5.079015231299358486828e-19
> >>> print(1-pf(z, 1, n-2, lower=TRUE), digits=22)
> >> [1] 0
> >>
> >> When using the t.test() function, the p-value is printed as: p-value < 2.2e-16.
> >>
> >> Any comment is welcome.
> >>
> >> Christophe
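For the numerical point in Christophe's original question: 1 - pt(t_score, df = n-2, lower=TRUE) first computes a lower-tail probability of roughly 1 - 2.5e-19, which rounds to exactly 1 in double precision, so the subtraction returns 0. This is ordinary floating-point cancellation, not a bug in pt(). Asking for the upper tail directly, or working on the log scale, avoids it. A minimal sketch with the values from the post:

t_score <- 11.23995
n <- 95

## the upper tail, computed directly, involves no cancellation
pt(t_score, df = n - 2, lower.tail = FALSE)                # about 2.54e-19

## 1 - lower tail underflows: 1 - 2.54e-19 rounds to exactly 1
1 - pt(t_score, df = n - 2, lower.tail = TRUE)             # 0

## for even more extreme statistics, stay on the log scale
pt(t_score, df = n - 2, lower.tail = FALSE, log.p = TRUE)  # about -42.8, i.e. exp(-42.8)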
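To illustrate Eik's point about effect size versus test statistic, here is a quick sketch with made-up group summaries (m1, m2, and s below are hypothetical numbers): the standardized mean difference stays fixed as n grows, while the z statistic grows like sqrt(n) and the p-value shrinks accordingly.

## hypothetical summaries: two equal-sized groups, same means and SD throughout
m1 <- 10; m2 <- 9; s <- 2
for (n in c(10, 100, 1000)) {
  d <- (m1 - m2) / s                          # standardized mean difference (effect size)
  z <- (m1 - m2) / (s * sqrt(1/n + 1/n))      # z statistic, with the 1/sqrt(n) term included
  p <- 2 * pnorm(abs(z), lower.tail = FALSE)  # two-sided p-value
  cat(sprintf("n = %4d  d = %.2f  z = %5.2f  p = %.2e\n", n, d, z, p))
}

Here d stays at 0.50 for every n, while z climbs from about 1.1 to about 11.2 and the p-value drops from about 0.26 to far below 2.2e-16.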
I have not reviewed the formulas presented, but you err in asserting that the information on that site is wrong. The Central Limit Theorem is about the raw data, and the law of large numbers applies to averages obtained from it. "Use t if n < 30" was the rule I was trained on. If I recall correctly, z is more reliable below 30 than t is above it. Unless the math formulas are wrong, that site seems useful. I wonder how far t diverges from z, but do not have time to compare them exactly.

On Wed, Nov 5, 2025, 9:04 AM Viechtbauer, Wolfgang (NP) via R-help <r-help at r-project.org> wrote:

> [quoted text snipped]
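Regarding how far t diverges from z: that is quick to check directly in R. A small sketch using the statistic from this thread, with pnorm() as the z reference and pt() as the t reference (nu denotes the degrees of freedom):

q <- 11.23995
pnorm(q, lower.tail = FALSE)        # normal upper tail, about 1.3e-29
for (nu in c(10, 30, 93, 1000)) {
  cat(sprintf("df = %4d  pt upper tail = %.3e\n",
              nu, pt(q, df = nu, lower.tail = FALSE)))
}

Even at df = 93 the t tail (about 2.5e-19) is some ten orders of magnitude larger than the normal tail, which is exactly Wolfgang's parenthetical point: far out in the tails the two still differ substantially, however large n is.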
The BMJ guide used to include Z scores. They have removed that section from their live website, but the math has not changed. Z scores make a great deal of sense to use. Look up older versions of Statistics at Square One; I have an older one saved with all the sections. BMJ had in-depth guides on them, and that may be where that blog got the idea. One reason to use them is that they are often easier to calculate with a base software package's functions, rather than needing to import additional libraries. Functional programming is easier with Z than with t.

On Wed, Nov 5, 2025, 9:04 AM Viechtbauer, Wolfgang (NP) via R-help <r-help at r-project.org> wrote:

> [quoted text snipped]
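For what it's worth, a two-sample z-test with a proper standard error needs nothing beyond base R; pnorm() lives in the stats package that ships with it. A minimal sketch with hypothetical summary statistics (all the numbers below are made up):

## hypothetical group summaries
m1 <- 10; m2 <- 9; sd1 <- 2; sd2 <- 2.5; n1 <- 95; n2 <- 95

se <- sqrt(sd1^2 / n1 + sd2^2 / n2)  # SE of the mean difference; this is where 1/sqrt(n) enters
z  <- (m1 - m2) / se                 # z statistic
p  <- 2 * pnorm(abs(z), lower.tail = FALSE)  # two-sided p-value
c(z = z, p = p)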