Karl Ove Hufthammer
2015-Jul-20 12:59 UTC
[R] Difference between drop1() vs. anova() for Gaussian glm models
Dear list members,

I'm having some problems understanding why drop1() and anova() give
different results for *Gaussian* glm models. Here's a simple example:

  d = data.frame(x = 1:6,
                 group = factor(c(rep("A", 2), rep("B", 4))))
  l = glm(x ~ group, data = d)

Running the following code gives *three* different p-values. (I would
expect it to give two different p-values.)

  anova(l, test="F")      # p = 0.04179
  anova(l, test="Chisq")  # p = 0.00313
  drop1(l, test="Chisq")  # p = 0.00841

I'm used to anova() and drop1() giving identical results for the same
'test' argument. However, it looks like the first two tests above use the
F statistic as the test statistic, while the last one uses a 'scaled
deviance' statistic:

  1 - pf(8.7273, 1, 4)    # F statistic
  1 - pchisq(8.7273, 1)   # F statistic
  1 - pchisq(6.9447, 1)   # scaled deviance

I couldn't find any documentation on this difference. The help page for
drop1() does say:

  The F tests for the "glm" methods are based on analysis of
  deviance tests, so if the dispersion is estimated it is based
  on the residual deviance, unlike the F tests of anova.glm.

But here it's talking about *F* tests. And drop1() with test="F" actually
gives the *same* p-value as anova() with test="F":

  drop1(l, test="F")      # p = 0.04179

Any ideas why anova() and drop1() use different test statistics for the
same 'test' arguments? And why does the help page imply (?) that the
results should differ for F tests (while not mentioning chi-squared
tests), when here they do not (and the chi-squared tests do)?

$ sessionInfo()
R version 3.2.1 (2015-06-18)
Platform: x86_64-suse-linux-gnu (64-bit)
Running under: openSUSE 20150714 (Tumbleweed) (x86_64)

locale:
 [1] LC_CTYPE=nn_NO.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=nn_NO.UTF-8        LC_COLLATE=nn_NO.UTF-8
 [5] LC_MONETARY=nn_NO.UTF-8    LC_MESSAGES=nn_NO.UTF-8
 [7] LC_PAPER=nn_NO.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=nn_NO.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets
[6] methods   base

loaded via a namespace (and not attached):
[1] tools_3.2.1

--
Karl Ove Hufthammer
E-mail: karl at huftis.org
Jabber: huftis at jabber.no
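A sketch of where the two test statistics above come from, assuming the
same d and l as in the post (values rounded): the 8.7273 is the deviance
difference between the null and full models divided by the dispersion
estimated from the full model, and the 6.9447 is the ordinary
likelihood-ratio statistic, 2 * (difference in log-likelihoods).

  ## Re-create the data and model from the post above.
  d <- data.frame(x = 1:6,
                  group = factor(c(rep("A", 2), rep("B", 4))))
  l  <- glm(x ~ group, data = d)   # full model
  l0 <- update(l, . ~ 1)           # intercept-only (null) model

  ## Deviance difference scaled by the full-model dispersion estimate:
  ## this is the 8.7273 that both anova() tests evaluate.
  dev_diff <- deviance(l0) - deviance(l)      # 17.5 - 5.5 = 12
  disp     <- deviance(l) / df.residual(l)    # 5.5 / 4 = 1.375
  dev_diff / disp                             # 8.7273

  ## The ordinary likelihood-ratio statistic: this is the 6.9447 that
  ## drop1(l, test="Chisq") evaluates.
  lrt <- as.numeric(2 * (logLik(l) - logLik(l0)))   # 6.9447

  ## The three p-values from the post:
  1 - pf(dev_diff / disp, 1, df.residual(l))  # 0.04179  anova(l, test="F")
  1 - pchisq(dev_diff / disp, 1)              # 0.00313  anova(l, test="Chisq")
  1 - pchisq(lrt, 1)                          # 0.00841  drop1(l, test="Chisq")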
peter dalgaard
2015-Jul-20 17:03 UTC
[R] Difference between drop1() vs. anova() for Gaussian glm models
I am somewhat surprised that _anything_ sensible comes out of
anova.glm(l, test="Chisq"). I think it is mostly expected that you use F
tests for that case. What does seem to come out is the same as for
drop1(l, test="Rao"), which gives the scaled score test, which would seem
to be equivalent to scaled deviance in this case.

drop1.glm(l, test="Chisq") appears to be calculating the "real"
likelihood ratio test, evaluated in its asymptotic chi-square
distribution:

  > 2*(logLik(l) - logLik(update(l,.~1)))
  'log Lik.' 6.944717 (df=3)

(Apologies for the daft output there... Why does "-" not either subtract
the df or unclass the whole thing?)

Notice that the scaled tests basically assume that the scale is known,
even if it is estimated, so in that sense the real LRT should be
superior. However, in that case it is well known that the asymptotic
approximation can be improved by transforming the LRT to the F statistic,
whose exact distribution is known.

The remaining part of the riddle is why anova.glm doesn't do likelihood
differences in the same fashion as drop1.glm. My best guess is that it
tries to be consistent with anova.lm, and anova.lm tries not to have to
refit the sequence of submodels.

> On 20 Jul 2015, at 14:59 , Karl Ove Hufthammer <karl at huftis.org> wrote:
>
> anova(l, test="F")      # p = 0.04179
> anova(l, test="Chisq")  # p = 0.00313
> drop1(l, test="Chisq")  # p = 0.00841
--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd.mes at cbs.dk   Priv: PDalgd at gmail.com
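As a sketch of the points in the reply above (re-creating d and l from the
first post so the snippet runs on its own): anova()'s chi-squared test
should agree with drop1()'s Rao test here, and the likelihood-ratio
statistic is a monotone transformation of the F statistic, so the F test
uses the same information but with an exact rather than an asymptotic
reference distribution.

  d <- data.frame(x = 1:6,
                  group = factor(c(rep("A", 2), rep("B", 4))))
  l  <- glm(x ~ group, data = d)
  l0 <- update(l, . ~ 1)

  ## anova()'s chi-squared test vs. drop1()'s scaled score test:
  anova(l, test = "Chisq")   # p for group: 0.00313
  drop1(l, test = "Rao")     # same p-value here, per the reply above

  ## The "real" LRT and the F statistic are monotone transformations of
  ## each other for a Gaussian model (with 1 numerator df here), so they
  ## carry the same information; the F test just uses an exact reference
  ## distribution.
  lrt   <- as.numeric(2 * (logLik(l) - logLik(l0)))          # 6.9447
  Fstat <- (deviance(l0) - deviance(l)) /
             (deviance(l) / df.residual(l))                  # 8.7273
  nobs(l) * log(1 + Fstat / df.residual(l))                  # 6.9447, equals lrt
  1 - pf(Fstat, 1, df.residual(l))                           # 0.04179 (exact)
  1 - pchisq(lrt, 1)                                         # 0.00841 (asymptotic)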