thr3ads.net - R help - [R] t.test with Welch correction is ambiguous [Nov 2023]

Dr. Rainer Düsing

2023-Nov-27 12:42 UTC

[R] t.test with Welch correction is ambiguous

Dear R Team!



There was an ongoing debate on ResearchGate about the "Welch" option in
your base R t.test command. A user noticed that the correction of the
degrees of freedom labeled "Welch Two Sample t-test", which you get if you
choose var.equal = FALSE in the R t.test command, differs from the output of
the Stata analysis that is also labeled "Welch's degrees of freedom".
Confusingly enough, the R output coincided with the Stata result labeled
"Satterthwaite's degrees of freedom". Unfortunately, the R documentation
isn't clear either, since it lacks any specific reference and the wording is
ambiguous: "If TRUE then the pooled variance is used to estimate the
variance otherwise the Welch (or Satterthwaite) approximation to the
degrees of freedom is used." It sounds as if two different options are
available, rather than that both authors proposed the same correction
independently.
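
As a quick check of which setting produces which label, here is a minimal sketch. The vectors x and y are made-up data for illustration and are not taken from the discussion; only t.test() itself is base R.

```r
# Hypothetical data, for illustration only
x <- c(5.1, 4.9, 6.2, 5.8, 6.0, 5.5)
y <- c(4.2, 5.0, 3.9, 4.8, 4.4, 5.3, 4.1, 4.6)

res_unequal <- t.test(x, y, var.equal = FALSE)  # FALSE is the default
res_pooled  <- t.test(x, y, var.equal = TRUE)

res_unequal$method             # "Welch Two Sample t-test"
res_unequal$parameter          # adjusted, usually non-integer df
res_pooled$method              # label without "Welch"
unname(res_pooled$parameter)   # 12, i.e. length(x) + length(y) - 2
```

The "Welch" label and the adjusted degrees of freedom appear with var.equal = FALSE, while var.equal = TRUE gives the pooled test.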

After doing some research and looking into the R code, we found a solution
and would like to suggest an update to the R documentation to make it
clearer (you can find a similar proposal to the Stata list here:
https://www.statalist.org/forums/forum/general-stata-discussion/general/1734987-unequal-vs-welch-options-for-ttest-why-no-mention-of-welch-1938-in-the-documentation
)

What is called "Welch Two Sample t-test" in the t.test command refers to
two publications (see links below) with the same correction, namely Welch
(1938) and Satterthwaite (1946). Hence, you also find "Welch-Satterthwaite"
correction as a description in the literature for this (which is the
aforementioned "Satterthwaite's degrees of freedom" correction in Stata).
But there is also another correction proposed by Welch (1947), which has
slightly different denominators (see code below), which is called "Welch's
degrees of freedom" in Stata. This option is not available in R so far.
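
Written out as formulas, the two corrections can be sketched as follows. This is transcribed from the code proposal at the end of this message; the symbols are notation introduced here for readability ($s_x^2, s_y^2$ for the sample variances, $n_x, n_y$ for the sample sizes, $\nu$ for the degrees of freedom).

```latex
% Welch (1938) / Satterthwaite (1946): what R's t.test reports
\nu_{\mathrm{WS}} =
  \frac{\left( \frac{s_x^2}{n_x} + \frac{s_y^2}{n_y} \right)^2}
       {\frac{(s_x^2 / n_x)^2}{n_x - 1} + \frac{(s_y^2 / n_y)^2}{n_y - 1}}

% Welch (1947): denominators n + 1 instead of n - 1, then subtract 2
\nu_{\mathrm{W47}} = -2 +
  \frac{\left( \frac{s_x^2}{n_x} + \frac{s_y^2}{n_y} \right)^2}
       {\frac{(s_x^2 / n_x)^2}{n_x + 1} + \frac{(s_y^2 / n_y)^2}{n_y + 1}}
```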

Therefore, we suggest a) citing the appropriate references in the
documentation (at least Welch (1938) and Satterthwaite (1946)), b) changing
the output label to something like "Welch-Satterthwaite adjusted Two Sample
t-test", and maybe c) adding a third option for the Welch (1947)
adjustment, with the Welch-Satterthwaite correction as the default
option (Aspin & Welch, 1949). A code proposal for the df correction is below.

Best wishes,
Rainer Düsing



   1. Welch (1938): https://www.jstor.org/stable/2332010
   2. Satterthwaite (1946): https://www.jstor.org/stable/3002019
   3. Welch (1947): https://www.jstor.org/stable/2332510
   4. Aspin, Alice A., and B. L. Welch. "Tables for Use in Comparisons
   Whose Accuracy Involves Two Variances, Separately Estimated."
   *Biometrika* 36, no. 3/4 (1949): 290-96. https://doi.org/10.2307/2332668
   [see point 4 in the Appendix by Welch]



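# x and y are the two numeric samples.
# Proposed values for var.equal:
#   "yes"   pooled variance
#   "Welch" Welch (1947)
#   "W-S"   Welch (1938) / Satterthwaite (1946), the proposed default
var.equal <- "W-S"

vx <- var(x)
nx <- length(x)
vy <- var(y)
ny <- length(y)

if (var.equal == "yes") {
  # Pooled variance with nx + ny - 2 degrees of freedom
  df <- nx + ny - 2
  v <- 0
  if (nx > 1)
    v <- v + (nx - 1) * vx
  if (ny > 1)
    v <- v + (ny - 1) * vy
  v <- v/df
  stderr <- sqrt(v * (1/nx + 1/ny))
} else if (var.equal == "Welch") {
  # Welch (1947): denominators nx + 1 and ny + 1, then subtract 2
  stderrx <- sqrt(vx/nx)
  stderry <- sqrt(vy/ny)
  stderr <- sqrt(stderrx^2 + stderry^2)
  df <- -2 + (stderr^4/(stderrx^4/(nx + 1) + stderry^4/(ny + 1)))
} else {
  # Welch-Satterthwaite: denominators nx - 1 and ny - 1
  stderrx <- sqrt(vx/nx)
  stderry <- sqrt(vy/ny)
  stderr <- sqrt(stderrx^2 + stderry^2)
  df <- stderr^4/(stderrx^4/(nx - 1) + stderry^4/(ny - 1))
}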
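
The df part of the proposal could be wrapped in a function and checked against t.test(), roughly as sketched here. The helper df_two_sample() and its method argument are hypothetical names used only for this illustration; they are not part of base R, and the data are made up.

```r
# Sketch only: df_two_sample() is a hypothetical helper, not a base R function.
df_two_sample <- function(x, y, method = c("W-S", "Welch", "yes")) {
  method <- match.arg(method)
  nx <- length(x)
  ny <- length(y)
  ax <- var(x) / nx  # squared standard error of mean(x)
  ay <- var(y) / ny  # squared standard error of mean(y)
  switch(method,
    "yes"   = nx + ny - 2,
    "Welch" = -2 + (ax + ay)^2 / (ax^2 / (nx + 1) + ay^2 / (ny + 1)),
    "W-S"   = (ax + ay)^2 / (ax^2 / (nx - 1) + ay^2 / (ny - 1))
  )
}

# Hypothetical data, for illustration only
x <- c(5.1, 4.9, 6.2, 5.8, 6.0, 5.5)
y <- c(4.2, 5.0, 3.9, 4.8, 4.4, 5.3, 4.1, 4.6)

df_ws  <- df_two_sample(x, y, "W-S")
df_w47 <- df_two_sample(x, y, "Welch")

# The default t.test() (var.equal = FALSE) reports the W-S value
all.equal(df_ws, unname(t.test(x, y)$parameter))  # TRUE

# Balanced case with equal sample variances and n = 4 per group
df_two_sample(1:4, 2:5, "W-S")    # 6, i.e. 2n - 2
df_two_sample(1:4, 2:5, "Welch")  # 8, i.e. 2n
```

In the balanced, equal-variance case the two corrections differ by exactly 2 degrees of freedom, which is the "slightly different denominators" effect in its simplest form.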


-- 
Dr. rer. nat. Rainer Düsing, Dipl.-Psych.
Universität Osnabrück
Institut für Psychologie
Fachgebiet Forschungsmethodik, Diagnostik und Evaluation
Lise-Meitner-Str. 3
49076 Osnabrück

Raum 75/222
Tel: +49-541 969 7734
Email: raduesing at uos.de <rduesing at uos.de>

Ebert,Timothy Aaron

2023-Nov-27 15:22 UTC

Your solution was educational. Thank you. I have three comments.
1) If you do not provide both options then you are forcing people to conform to
your approach. In general I disapprove, but for specific cases I can see
advantages.
2) Without reading the relevant papers (and possibly understanding them), is
there a simple metric that would enable the correct choice between
Welch-Satterthwaite and Welch (1947)?
3) If there is a broad consensus that Welch (1947) is never the correct option
then do not implement it.
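
This does not settle which correction is preferable, but one way to see how much the choice matters for a given data set is to compute the p-value under each df. A minimal sketch follows, using made-up data and the two df formulas from the code proposal in the original message:

```r
# Hypothetical data, for illustration only
x <- c(5.1, 4.9, 6.2, 5.8, 6.0, 5.5)
y <- c(4.2, 5.0, 3.9, 4.8, 4.4, 5.3, 4.1, 4.6)

nx <- length(x)
ny <- length(y)
ax <- var(x) / nx  # squared standard error of mean(x)
ay <- var(y) / ny  # squared standard error of mean(y)

# The t statistic is the same for both corrections
t_stat <- (mean(x) - mean(y)) / sqrt(ax + ay)

# Only the degrees of freedom differ
df_ws  <- (ax + ay)^2 / (ax^2 / (nx - 1) + ay^2 / (ny - 1))
df_w47 <- -2 + (ax + ay)^2 / (ax^2 / (nx + 1) + ay^2 / (ny + 1))

# Two-sided p-values under each df
p_ws  <- 2 * pt(-abs(t_stat), df_ws)
p_w47 <- 2 * pt(-abs(t_stat), df_w47)

c(df_ws = df_ws, df_w47 = df_w47, p_ws = p_ws, p_w47 = p_w47)
```

Here p_ws should agree with t.test(x, y)$p.value, since the default t.test() uses the Welch-Satterthwaite df.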

As written, it sounds like Welch (1938) proposed a correction. Welch published
another correction in 1947, but then retracted his 1947 correction in a 1949
paper. At least that is how I interpret what was written in your option c.

Tim