> On 13 Nov 2018, at 16:19, Paul Johnson <pauljohn32 at gmail.com> wrote:
>
> Long ago, when R's t.test had var.equal=TRUE by default, I wrote some
> class notes showing that the result was equivalent to a one predictor
> regression model. Because t.test does not default to var.equal=TRUE
> these days, I'm curious to know if there is a way to specify weights
> in an lm to obtain the same result as the Welch-adjusted values
> reported by t.test at the current time. Is there a WLS equivalent
> adjustment with lm?
>
The short answer is no. The long answer is to look into heteroscedasticity
adjustments or lmer combined with pbkrtest.
Well, you can supply the weights in lm() and get the right t statistic, but not
the Satterthwaite fractional degrees of freedom. (The sleep data are actually
paired, but ignore that here.)
variances <- tapply(sleep$extra, sleep$group, var)  # per-group variances
sleep$wt <- 1/variances[sleep$group]                # inverse-variance weights
t.test(extra ~ group, data = sleep)
summary(lm(extra ~ factor(group), weights = wt, data = sleep))
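To see exactly what the weighted lm misses, the Welch-Satterthwaite df can be
computed by hand (not in the original message, just the standard formula, reusing
the variances object from above):
n  <- tapply(sleep$extra, sleep$group, length)
v  <- variances / n                    # squared standard error of each group mean
welch.df <- sum(v)^2 / sum(v^2 / (n - 1))
welch.df                               # about 17.776, the df reported by t.test()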
The pbkrtest approach goes like this:
library(lme4)
library(pbkrtest)
## arrange that gdummy is 1 in the group with the larger variance
sleep$gdummy <- as.numeric(sleep$group) - 1
fit1 <- lmer(extra ~ group + (gdummy+0 | ID), sleep)
fit0 <- lmer(extra ~ 1 + (gdummy+0 | ID), sleep)
KRmodcomp(fit0, fit1)
(This somewhat abuses the existing sleep$ID -- all you really need is a grouping
factor with a level for each record where gdummy == 1.)
This ends up with
        stat    ndf     ddf F.scaling p.value
Ftest 3.4626 1.0000 17.7765         1 0.07939 .
to be compared with the Welch test
t = -1.8608, df = 17.776, p-value = 0.07939
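A related sketch (not in the original reply, assuming the lmerTest package) is to
refit the same model with lmerTest, whose summary() reports Satterthwaite-approximated
df for the fixed effects; the df for the group coefficient should land close to the
Welch value:
library(lmerTest)  # adds Satterthwaite df and p-values to lmer summaries
fit1s <- lmerTest::lmer(extra ~ group + (gdummy + 0 | ID), sleep)
summary(fit1s)     # coefficient table includes the approximate df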
-pd
> Here's example code to show that lm gives the same result as t.test with
> var.equal = TRUE. The signs come out differently, but otherwise the effect
> estimate, standard error, and t value are the same:
>
>
> set.seed(234234)
> dat <- data.frame(x = gl(2, 50, labels = c("F", "M")))
> dat$err <- rnorm(100, 0, 1)
> dat$y <- ifelse(dat$x == "F", 40 + dat$err, 44 + dat$err)
> m1 <- lm(y ~ x, dat)
> summary(m1)
> m1.t <- t.test(y ~ x, dat, var.equal = TRUE)
> m1.t
> ## diff matches regression coefficient
> (m1.t.effect <- diff(m1.t$estimate))
> ## standard error matches regression se
> m1.t.stderr <- m1.t.effect/m1.t$statistic
>
> If you run that, you see lm output:
>
> Coefficients:
> Estimate Std. Error t value Pr(>|t|)
> (Intercept) 39.9456 0.1180 338.65 <2e-16 ***
> xM 3.9080 0.1668 23.43 <2e-16 ***
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> Residual standard error: 0.8341 on 98 degrees of freedom
> Multiple R-squared: 0.8485, Adjusted R-squared: 0.8469
> F-statistic: 548.8 on 1 and 98 DF, p-value: < 2.2e-16
>
> and t.test:
>
>> m1.t <- t.test(y ~ x, dat, var.equal = TRUE)
>> m1.t
>
> Two Sample t-test
>
> data: y by x
> t = -23.427, df = 98, p-value < 2.2e-16
> alternative hypothesis: true difference in means is not equal to 0
> 95 percent confidence interval:
> -4.239038 -3.576968
> sample estimates:
> mean in group F mean in group M
> 39.94558 43.85358
>
>> (m1.t.effect <- diff(m1.t$estimate))
> mean in group M
> 3.908003
>> m1.t.effect/m1.t$statistic
> mean in group M
> -0.1668129
>
>
>
>
> --
> Paul E. Johnson http://pj.freefaculty.org
> Director, Center for Research Methods and Data Analysis http://crmda.ku.edu
>
> To write to me directly, please address me at pauljohn at ku.edu.
>
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com