Sacha Viquerat
2011-Mar-18 12:35 UTC
[R] general question about dropping terms of glm model fits
Hello dear list!

As I am currently helping someone with the statistical analysis of a count survey, I stumbled upon a very basic question about model optimization.

When fitting a model like:

    model<-lmer(abundance~(x+y+z)^3,family=poisson,...)

in which x, y, z are continuous abiotic parameters such as po4 concentration or no2 concentration, which terms / interaction terms would you recommend removing FIRST?

The ones of lowest significance (i.e. the ones with the highest p-value), OR the ones with the most complex interaction structure (even though their p-values may be low-ish)?

Another question just popped into my mind. Let's say I've reduced my model to significant terms:

    y ~ temperature + po4 + po4:temperature

and I know that the correlation between po4 and temperature is high. Would you say that this is reason enough to remove the interaction term?

Any opinion is a welcome opinion!
Frank Harrell
2011-Mar-18 22:37 UTC
[R] general question about dropping terms of glm model fits
Dropping any terms on the basis of P-values, AIC, etc. will distort statistical inference. If you do drop terms, use the hierarchy principle: remove a higher-order interaction before any lower-order term it contains. High correlations between variables don't necessarily invalidate a test.

Frank

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
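As a rough illustration of the hierarchy principle mentioned above, here is a minimal sketch in R. It is not the poster's actual analysis: the data are simulated, the coefficients are invented, and a plain `glm()` stands in for the mixed model because the random-effects part of the original call was elided. It only shows which term is eligible for removal first, and does not recommend selecting terms by P-value.

```r
# Simulated stand-in data: x, y, z and all coefficients are hypothetical.
set.seed(1)
n <- 200
d <- data.frame(x = rnorm(n), y = rnorm(n), z = rnorm(n))
d$abundance <- rpois(n, lambda = exp(0.5 + 0.3 * d$x - 0.2 * d$y + 0.2 * d$x * d$y))

# Full model: main effects, all two-way interactions, and the three-way interaction.
full <- glm(abundance ~ (x + y + z)^3, family = poisson, data = d)

# drop1() respects marginality: it only lists terms that can be removed
# without leaving a higher-order interaction without its lower-order terms.
candidates <- drop1(full, test = "Chisq")
print(candidates)

# Under the hierarchy principle, the three-way interaction goes first.
no3way <- update(full, . ~ . - x:y:z)

# Likelihood-ratio comparison of the two nested fits.
print(anova(no3way, full, test = "Chisq"))
```

For the full model, `drop1()` offers only `x:y:z`. The two-way interactions and main effects become candidates only after the terms that contain them have been removed.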