Maya Pfaff
2009-Nov-02 13:40 UTC
[R] Interaction contrasts or posthoc test for glm (MASS) with ANOVA design
Dear R experts I am running a negative-binomial GLM (glm.nb) to test the null hypotheses that species 1 and 2 are equally abundant between site 1 and site2, and between each other. So, I have a 2x2 factorial design with factors Site (1,2) and Taxon (1,2). Since the Site:Taxon interaction is significant, I need to do the equivalent to a "post-hoc test" for ANOVA, however, the same tests (e.g. Tukey HSD) do not seem to be applicable for the GLM. I tried specifying orthogonal contrasts, but could not figure out what the interaction contrast (see Site1:Taxon1 in below example) means. Could you please advise me how to specify a meaningful interaction contrast (i.e. contrast species within sites)? Alternatively, could you recommend a way to do posthoc comparisons? Thanks for your time and kind regards Maya> library(MASS) > counts <- c(1, 4, 9, 2, 1, 4, 2, 4, 1, 3, 2, 2, 1, 3, 1, 1, 2, 1, 113, 83,49, 46, 13, 52, 4, 10, 14, 10, 3, 19, 8, 21, 151, 186, 99, 11, 29, 24, 24, 62, 15, 98, 30, 21, 63, 29, 48, 11, 16, 35, 21, 17, 6, 2, 2, 3, 3, 4, 4, 2, 1, 2, 2, 3, 4, 8, 10, 3, 14, 3, 11, 23, 3, 51, 8, 8, 7, 1, 13, 8, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 5, 8, 1, 1, 20, 2, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 3, 0, 1, 0, 5, 1, 1, 9, 0, 34, 4, 1, 17, 0, 7, 33, 86, 73, 67, 79, 109, 27, 37, 23, 12, 17, 41, 8, 38, 4, 23, 14, 49, 64, 39, 31, 156, 110, 97, 33, 170, 137, 72, 28, 54)> Site <- factor(c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2))> Taxon <- factor(c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1))> contrasts(Site) <- cbind(c(1,-1)) > contrasts(Taxon) <- cbind(c(1,-1)) > s1 <- glm.nb(counts ~ Site*Taxon) > summary(s1)Call: glm.nb(formula = counts ~ Site * Taxon, init.theta = 0.612672617555492, link = log) Deviance Residuals: Min 1Q Median 3Q Max -2.269021 -1.215727 -0.454719 0.003515 2.517288 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) 2.6139 0.1097 23.831 < 2e-16 *** Site1 -0.8664 0.1097 -7.899 2.81e-15 *** Taxon1 -0.3814 0.1097 -3.477 0.000506 *** Site1:Taxon1 -0.5977 0.1097 -5.449 5.06e-08 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 (Dispersion parameter for Negative Binomial(0.6127) family taken to be 1) Null deviance: 242.94 on 153 degrees of freedom Residual deviance: 176.20 on 150 degrees of freedom AIC: 1165.4 Number of Fisher Scoring iterations: 1 Theta: 0.6127 Std. Err.: 0.0690 2 x log-likelihood: -1155.4010> > TukeyHSD(s1)Error in UseMethod("TukeyHSD") : no applicable method for "TukeyHSD" [[alternative HTML version deleted]]
David Winsemius
2009-Nov-02 15:39 UTC
[R] Interaction contrasts or posthoc test for glm (MASS) with ANOVA design
On Nov 2, 2009, at 8:40 AM, Maya Pfaff wrote:> Dear R experts > > I am running a negative-binomial GLM (glm.nb) to test the null > hypotheses > that species 1 and 2 are equally abundant between site 1 and site2, > and > between each other. So, I have a 2x2 factorial design with factors > Site > (1,2) and Taxon (1,2). > Since the Site:Taxon interaction is significant, I need to do the > equivalent > to a "post-hoc test" for ANOVA, however, the same tests (e.g. Tukey > HSD) do > not seem to be applicable for the GLM. I tried specifying orthogonal > contrasts, but could not figure out what the interaction contrast (see > Site1:Taxon1 in below example) means.I'm a bit puzzled by this request, since it appears that you already have the test result you seek in the form of the line beginning Site1:Taxon1. (It might be a different story if you had more species.) In the default glm setup, the contrasts are of type "treatment". Your Site1:Taxon1 coefficients would then be the mean difference (and se) from the estimates based on Intercept+Taxon+Site( on the scale being regressed on). You will notice that the z value does not change when you use treatment contrasts instead of those you specified. The only thing further to do would be to construct a reduced model without the interaction, take the difference in deviances of the model, and compare to chi-sq(significance level, df=1) and that would give you the ML test rather than the score test which results from the by coefficient tests. When I do that I get s1: 2 x log-likelihood: -1155.4010 s2: 2 x log-likelihood: -1181.8600 so > 1-pchisq(26.459,df=1) [1] 2.691913e-07 (Which is immaterially different than the Pr(>|z|) that you get from the default output. Of course that was the way we did it in school 20 years ago and it would be much faster to do: > anova(s1,s2) Likelihood ratio tests of Negative Binomial Models Response: counts Model theta Resid. df 2 x log-lik. Test df LR stat. Pr(Chi) 1 Site + Taxon 0.5263897 151 -1181.860 2 Site * Taxon 0.6126726 150 -1155.401 1 vs 2 1 26.45960 2.691083e-07 -- David> Could you please advise me how to specify a meaningful interaction > contrast > (i.e. contrast species within sites)? Alternatively, could you > recommend a > way to do posthoc comparisons? > > Thanks for your time and kind regards > Maya > >> library(MASS) >> counts <- c(1, 4, 9, 2, 1, 4, 2, 4, 1, 3, 2, 2, 1, 3, 1, 1, 2, 1, >> 113, 83, > 49, 46, 13, 52, 4, 10, 14, 10, 3, 19, 8, 21, 151, 186, 99, 11, 29, > 24, 24, > 62, 15, 98, 30, 21, 63, 29, 48, 11, 16, 35, 21, 17, 6, 2, 2, 3, 3, > 4, 4, 2, > 1, 2, 2, 3, 4, 8, 10, 3, 14, 3, 11, 23, 3, 51, 8, 8, 7, 1, 13, 8, 1, > 0, 1, > 0, 1, 1, 1, 0, 1, 1, 1, 5, 8, 1, 1, 20, 2, 0, 0, 0, 1, 0, 0, 0, 0, > 0, 1, 0, > 0, 3, 0, 1, 0, 5, 1, 1, 9, 0, 34, 4, 1, 17, 0, 7, 33, 86, 73, 67, > 79, 109, > 27, 37, 23, 12, 17, 41, 8, 38, 4, 23, 14, 49, 64, 39, 31, 156, 110, > 97, 33, > 170, 137, 72, 28, 54) >> Site <- factor(c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, >> 1, 1, > 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, > 2, 2, > 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, > 2, 2, > 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, > 1, 1, > 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, > 2, 2, > 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, > 2, 2, > 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)) >> Taxon <- factor(c(2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, >> 2, 2, 2, > 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, > 2, 2, > 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, > 2, 2, > 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, > 1, 1, > 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, > 1, 1, > 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, > 1, 1, > 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)) > >> contrasts(Site) <- cbind(c(1,-1)) >> contrasts(Taxon) <- cbind(c(1,-1)) >> s1 <- glm.nb(counts ~ Site*Taxon) >> summary(s1) > > Call: > glm.nb(formula = counts ~ Site * Taxon, init.theta = > 0.612672617555492, > link = log) > > Deviance Residuals: > Min 1Q Median 3Q Max > -2.269021 -1.215727 -0.454719 0.003515 2.517288 > > Coefficients: > Estimate Std. Error z value Pr(>|z|) > (Intercept) 2.6139 0.1097 23.831 < 2e-16 *** > Site1 -0.8664 0.1097 -7.899 2.81e-15 *** > Taxon1 -0.3814 0.1097 -3.477 0.000506 *** > Site1:Taxon1 -0.5977 0.1097 -5.449 5.06e-08 *** > --- > Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 > > (Dispersion parameter for Negative Binomial(0.6127) family taken to > be 1) > > Null deviance: 242.94 on 153 degrees of freedom > Residual deviance: 176.20 on 150 degrees of freedom > AIC: 1165.4 > > Number of Fisher Scoring iterations: 1 > > > Theta: 0.6127 > Std. Err.: 0.0690 > > 2 x log-likelihood: -1155.4010 >> >> TukeyHSD(s1) > Error in UseMethod("TukeyHSD") : no applicable method for "TukeyHSD" > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code.David Winsemius, MD Heritage Laboratories West Hartford, CT