Leslie Young
2010-Sep-14 15:43 UTC
[R] Model averaging with (and without) interaction terms
I've used logistic regression to create models to assess the effect of 3 variables on the presence or absence of a species, including the interaction terms between variables and model averaging using MuMIn: model.avg

The top models (delta<4) include several models with interaction terms and some models without; model weights are quite low for all models (<0.25). My problem is that the models with interactions have negative coefficients on the variables with a positive interaction term, whereas the same model without an interaction has positive coefficients. MuMIn: model.avg averages all these models together, so the relationship is washed out (CI overlaps 0).

E.g.

mod1 <- glm(presence ~ x1*x2, family="binomial")
coefficients: -0.661 x1, -0.043 x2, 0.02 x1:x2

mod2 <- glm(presence ~ x1 + x2, family="binomial")
coefficients: 0.245 x1, 0.021 x2

I've read that it is difficult to compare models with and without interaction terms, but nothing regarding how one might go about doing so. Should interaction models be averaged differently or separately than models without interaction terms? Is there another way to approach this?

Thanks in advance,
Leslie
Leslie Young <leslie.young101 <at> gmail.com> writes:
>
> I've used logistic regression to create models to assess the effect of
> 3 variables on the presence or absence of a species, including the
> interaction terms between variables and model averaging using MuMIn:
> model.avg
>
> The top models (delta<4) include several models with interaction terms
> and some models without; model weights are quite low for all models
> (<0.25). My problem is that the models with interactions have negative
> coefficients on the variables with a positive interaction term whereas
> the same model without an interaction has positive coefficients.
> MuMIn: model.avg averages all these models together, so the
> relationship is washed out (CI overlaps 0).
>
> E.g.
>
> mod1 <- glm(presence ~ x1*x2, family="binomial")
> coefficients: -0.661 x1, -0.043 x2, 0.02 x1:x2
>
> mod2 <- glm(presence ~ x1 + x2, family="binomial")
> coefficients: 0.245 x1, 0.021 x2
>
> I've read that it is difficult to compare models with and without
> interaction terms, but nothing regarding how one might go about doing
> so. Should interaction models be averaged differently or separately
> than models without interaction terms? Is there another way to
> approach this?

The tricky aspect of comparing models with and without interaction terms is that the main effect parameters have different meanings when the interactions are included. It looks like your predictors are both continuous, so I'll discuss things in this context. In the interaction model, the x1 parameter gives the expected change in log-odds (logit probability) for a 1-unit increase in x1 **when x2 is equal to 0**. In the non-interaction model, the x1 parameter gives the *average* expected change in log-odds across the distribution of x2 values observed in the data set.

It will help a bit if you follow Schielzeth [Schielzeth, Holger. 2010. Simple means to improve the interpretability of regression coefficients. Methods in Ecology and Evolution. doi:10.1111/j.2041-210X.2010.00012.x.] and mean-center your predictors before fitting the model. After centering, the x1 parameter in the interaction model describes the effect of x1 at the mean of x2 rather than at x2 = 0, which is much closer in meaning to the x1 parameter of the non-interaction model.

I have a bigger problem (which perhaps luckily for you is not shared with much of the ecological community), which is that usually people who are model-averaging *parameters* (rather than *predictions*) are essentially trying to use information-theoretic approaches to test hypotheses ...
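To make the point about main effects concrete: with the mod1 coefficients quoted above, the slope of x1 on the log-odds scale is -0.661 + 0.02*x2, which is negative at x2 = 0 but becomes positive once x2 exceeds about 33. The following is a minimal sketch of the centering suggestion using simulated data. The data, sample size, effect sizes and the names dat, x1c, x2c, m_raw, m_ctr and m_add are all assumptions made up for illustration, not anything from the original analysis.

```r
# Simulated example only: every value below is an illustrative assumption.
set.seed(1)
n  <- 2000
x1 <- rnorm(n, mean = 5,  sd = 1)
x2 <- rnorm(n, mean = 30, sd = 5)   # x2 = 0 lies far outside the observed range

# True model is written in terms of centered predictors
eta <- 0.25 * (x1 - 5) + 0.02 * (x2 - 30) + 0.05 * (x1 - 5) * (x2 - 30)
dat <- data.frame(presence = rbinom(n, 1, plogis(eta)), x1 = x1, x2 = x2)

# Mean-center the predictors
dat$x1c <- dat$x1 - mean(dat$x1)
dat$x2c <- dat$x2 - mean(dat$x2)

# Interaction model on raw predictors: the x1 coefficient is the slope at x2 = 0
m_raw <- glm(presence ~ x1 * x2, family = binomial, data = dat)

# Interaction model on centered predictors: the x1c coefficient is the slope
# at the mean of x2
m_ctr <- glm(presence ~ x1c * x2c, family = binomial, data = dat)

# Additive model on centered predictors, for comparison
m_add <- glm(presence ~ x1c + x2c, family = binomial, data = dat)

b_raw <- coef(m_raw)
b_ctr <- coef(m_ctr)

# The two interaction models are the same fit, just reparameterized:
# slope of x1 at mean(x2) = raw x1 coefficient + interaction * mean(x2)
x1_slope_at_mean_x2 <- unname(b_raw["x1"] + b_raw["x1:x2"] * mean(dat$x2))

print(round(rbind(raw = b_raw, centered = b_ctr), 3))
print(round(coef(m_add), 3))
print(all.equal(unname(b_ctr["x1c"]), x1_slope_at_mean_x2, tolerance = 1e-4))
```

Centering does not change the fitted probabilities or the interaction coefficient; it only moves the reference point at which the main effects are evaluated. That is why the x1c coefficient from m_ctr is the one to set beside the x1c coefficient from m_add, whereas the raw x1 coefficient from m_raw answers a different question.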