Martijn Wieling
2012-May-08 14:01 UTC
[R] mgcv: inclusion of random intercept in model - based on p-value of smooth or anova?
Dear useRs, I am using mgcv version 1.7-16. When I create a model with a few non-linear terms and a random intercept for (in my case) country using s(Country,bs="re"), the representative line in my model (i.e. approximate significance of smooth terms) for the random intercept reads: edf Ref.df F p-value s(Country) 36.127 58.551 0.644 0.982 Can I interpret this as there being no support for a random intercept for country? However, when I compare the simpler model to the model including the random intercept, the latter appears to be a significant improvement.> anova(gam1,gam2,test="F")Model 1: .... Model 2: .... + s(BirthNation, bs="re") Resid. Df Resid. Dev Df Deviance F Pr(>F) 1 789.44 416.54 2 753.15 373.54 36.292 43.003 2.3891 1.225e-05 *** I hope somebody could help me in how I should proceed in these situations. Do I include the random intercept or not? I also have a related question. When I used to create a mixed-effects regression model using lmer and included e.g., an interaction in the fixed-effects structure, I would test if the inclusion of this interaction was warranted using anova(lmer1,lmer2). It then would show me that I invested 1 additional df and the resulting (possibly significant) improvement in fit of my model. This approach does not seem to work when using gam. In this case an apparent investment of 1 degree of freedom for the interaction, might result in an actual decrease of the degrees of freedom invested by the total model (caused by a decrease of the edf's of splines in the model with the interaction). In this case, how would I proceed in determining if the model including the interaction term is better? With kind regards, Martijn Wieling -- ******************************************* Martijn Wieling http://www.martijnwieling.nl wieling at gmail.com +31(0)614108622 ******************************************* University of Groningen http://www.rug.nl/staff/m.b.wieling
Simon Wood
2012-May-11 15:43 UTC
[R] mgcv: inclusion of random intercept in model - based on p-value of smooth or anova?
Dear Martijn, Thanks for the off line code and data: very helpful. The answer to this is something of a 'can of worms'. Starting with the p-value inconsistency. The problem here really is that neither test is well justified in the case of s(...,"re") terms (and not having realised the extent of the problem it's not flagged properly). In the case of the p-value from `summary', the p-value is computed as if the random effect were any other smooth. However the theory on which the p-values for smooths rests does not hold for "re" terms (basically the usual notion of smoothing bias is meaningless in the "re" case, and "re" terms can not usually be well approximated by truncated eigen approximations). The upshot is that you can get bias toward accepting the null. I'll revert to doing something more sensible for "re" terms for the next release, but it still won't be great, I guess. The p-value from the comparison of models via 'anova' is equally suspect for "re" terms. Basically, this test is justified as a rough approximation in the case of usual smooth models, by the fact that we can approximate the model smooths by unpenalized reduced rank eigen approximations having degrees of freedom set to the effective degrees of freedom of the smooths. Again, however, such reduced rank approximations are generally not available for "re" terms, and I don't know if there is then a decent justification for the test in this case. 'AIC' might then be seen as the answer for model selection, but Greven and Kneib (2010, Biometrika), show that this is biased towards selecting the larger random effects model in this case (they provide a correction, but I'm not sure how easy it is to apply here). You are left with a couple of sensible possibilities that are easy to use, if it's not clear from the estimates that the term is zero. Both involve using gam(...,method="REML") or gam(...,method="ML"). 1. use gam.vcomp to get a confidence interval for the "re" variance component. If this is bounded well away from zero, then the result is clear. 2. Run a glrt test based on twice the difference in ML/REML score reported for the 2 models (c.f. chisq on 1 df for your case). This suffers from the usual problem of using a glrt test to test a variance component for equality to zero. (AIC based on this marginal likelihood doesn't fix the problem either --- see Greven and Kneib, again). The second issue, that adding a fixed effect can reduce the EDF, while improving the fit, is less of a problem, I think. If I'm happy to select the degree of smoothness of a model by GCV, REML or whatever, then I should also be happy to accept that the model with the fewer degrees of freedom, but more variables, is better than the one with more degrees of freedom and fewer variables. (The converse that I would ever reject the better fitting, less complex model is obviously perverse). You can get similar effects in ordinary linear modelling: adding an important predictor gives such an improvement in fit that you can drop polynomial dependencies on other predictors, so a model with more degrees of freedom but fewer variables does worse than one with fewer degrees of freedom and more variables... the issue is just a bit more prominent when fitting GAMs because part of model selection is integrated with fitting in this case. best, Simon > 08/05/12 15:01, Martijn Wieling wrote:> Dear useRs, > > I am using mgcv version 1.7-16. When I create a model with a few > non-linear terms and a random intercept for (in my case) country using > s(Country,bs="re"), the representative line in my model (i.e. > approximate significance of smooth terms) for the random intercept > reads: > edf Ref.df F p-value > s(Country) 36.127 58.551 0.644 0.982 > > Can I interpret this as there being no support for a random intercept > for country? However, when I compare the simpler model to the model > including the random intercept, the latter appears to be a significant > improvement. > >> anova(gam1,gam2,test="F") > Model 1: .... > Model 2: .... + s(BirthNation, bs="re") > Resid. Df Resid. Dev Df Deviance F Pr(>F) > 1 789.44 416.54 > 2 753.15 373.54 36.292 43.003 2.3891 1.225e-05 *** > > I hope somebody could help me in how I should proceed in these > situations. Do I include the random intercept or not? > > I also have a related question. When I used to create a mixed-effects > regression model using lmer and included e.g., an interaction in the > fixed-effects structure, I would test if the inclusion of this > interaction was warranted using anova(lmer1,lmer2). It then would show > me that I invested 1 additional df and the resulting (possibly > significant) improvement in fit of my model. > > This approach does not seem to work when using gam. In this case an > apparent investment of 1 degree of freedom for the interaction, might > result in an actual decrease of the degrees of freedom invested by the > total model (caused by a decrease of the edf's of splines in the model > with the interaction). In this case, how would I proceed in > determining if the model including the interaction term is better? > > With kind regards, > Martijn Wieling > > -- > ******************************************* > Martijn Wieling > http://www.martijnwieling.nl > wieling at gmail.com > +31(0)614108622 > ******************************************* > University of Groningen > http://www.rug.nl/staff/m.b.wieling > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >-- Simon Wood, Mathematical Science, University of Bath BA2 7AY UK +44 (0)1225 386603 http://people.bath.ac.uk/sw283
Simon Wood
2012-May-23 09:28 UTC
[R] mgcv: inclusion of random intercept in model - based on p-value of smooth or anova?
Having looked at this further, I've made some changes in mgcv_1.7-17 to the p-value computations for terms that can be penalized to zero during fitting (e.g. s(x,bs="re"), s(x,m=1) etc). The Wald statistic based p-values from summary.gam and anova.gam (i.e. what you get from e.g. anova(a) where a is a fitted gam object) are quite well founded for smooth terms that are non-zero under full penalization (e.g. a cubic spline is a straight line under full penalization). For such smooths, an extension of Nychka's (1988) result on CI's for splines gives a well founded distributional result on which to base a Wald statistic. However, the Nychka result requires the smoothing bias to be substantially less than the smoothing estimator variance, and this will often not be the case if smoothing can actually penalize a term to zero (to understand why, see argument in appendix of Marra & Wood, 2012, Scandinavian Journal of Statistics, 39,53-74). Simulation testing shows that this theoretical concern has serious practical consequences. So for terms that can be penalized to zero, alternative approximations have to be used, and these are now implemented in mgcv_1.7-17 (see ?summary.gam). The approximate test performed by anova(a,b) (a and b are fitted "gam" objects) is less well founded. It is a reasonable approximation when each smooth term in the models could in principle be well approximated by an unpenalized term of rank approximately equal to the edf of the smooth term, but otherwise the p-values produced are likely to be much too small. In particular simulation testing suggests that the test is not to be trusted with s(...,bs="re") terms, and can be poor if the models being compared involve any terms that can be penalized to zero during fitting. (Although the mechanisms are a little different, this is similar to the problem we would have if the models were viewed as regular mixed models and we tried to use a GLRT to test variance components for equality to zero). These issues are now documented in ?anova.gam and ?summary.gam... Simon On 08/05/12 15:01, Martijn Wieling wrote:> Dear useRs, > > I am using mgcv version 1.7-16. When I create a model with a few > non-linear terms and a random intercept for (in my case) country using > s(Country,bs="re"), the representative line in my model (i.e. > approximate significance of smooth terms) for the random intercept > reads: > edf Ref.df F p-value > s(Country) 36.127 58.551 0.644 0.982 > > Can I interpret this as there being no support for a random intercept > for country? However, when I compare the simpler model to the model > including the random intercept, the latter appears to be a significant > improvement. > >> anova(gam1,gam2,test="F") > Model 1: .... > Model 2: .... + s(BirthNation, bs="re") > Resid. Df Resid. Dev Df Deviance F Pr(>F) > 1 789.44 416.54 > 2 753.15 373.54 36.292 43.003 2.3891 1.225e-05 *** > > I hope somebody could help me in how I should proceed in these > situations. Do I include the random intercept or not? > > I also have a related question. When I used to create a mixed-effects > regression model using lmer and included e.g., an interaction in the > fixed-effects structure, I would test if the inclusion of this > interaction was warranted using anova(lmer1,lmer2). It then would show > me that I invested 1 additional df and the resulting (possibly > significant) improvement in fit of my model. > > This approach does not seem to work when using gam. In this case an > apparent investment of 1 degree of freedom for the interaction, might > result in an actual decrease of the degrees of freedom invested by the > total model (caused by a decrease of the edf's of splines in the model > with the interaction). In this case, how would I proceed in > determining if the model including the interaction term is better? > > With kind regards, > Martijn Wieling > > -- > ******************************************* > Martijn Wieling > http://www.martijnwieling.nl > wieling at gmail.com > +31(0)614108622 > ******************************************* > University of Groningen > http://www.rug.nl/staff/m.b.wieling > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >-- Simon Wood, Mathematical Science, University of Bath BA2 7AY UK +44 (0)1225 386603 http://people.bath.ac.uk/sw283
Apparently Analagous Threads
- GAM Chi-Square Difference Test
- group interaction in a varying coeff. model (mgcv)
- mgcv (bam) very large standard error difference between versions 1.7-11 and 1.7-17, bug?
- Problem with simple random slope in gam and bam (mgcv package)
- Problem extracting enough coefs from gam (mgcv package)