Hi,

Is it possible to compare two GAM objects created with the gam() function from the mgcv package? I use a slightly modified version of anova.glm() named anova.gam(), modified from John Fox (2002). It often gives me aberrant results, especially with the "F" test. I use a quasibinomial model, and the scale (dispersion) is estimated and used in the calculation of the F value. Has anyone already tried this, or does anyone know whether all this is theoretically possible?

Best wishes,
Fabien Fivaz

=================================================
Fabien Fivaz
Centre Suisse de Cartographie de la Faune (CSCF)
Terreaux 14
2000 Neuchâtel - Switzerland -
E-mail: fabien.fivaz at ie-zea.unil.ch
=================================================
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Dear Fabien,

At 03:03 PM 11/13/2002 +0100, Fabien.Fivaz at ie-zea.unil.ch wrote:
> Is it possible to compare two GAM objects created with the gam() function
> from the mgcv package? I use a slightly modified version of anova.glm()
> named anova.gam(), modified from John Fox (2002). It often gives me
> aberrant results, especially with the "F" test. I use a quasibinomial model,
> and the scale (dispersion) is estimated and used in the calculation of the F
> value. Has anyone already tried this, or does anyone know whether all this
> is theoretically possible?

(Your reference is probably to the anova.gam function in the script file for the on-line appendix on nonparametric regression to my R and S-PLUS Companion to Applied Regression.)

I don't have specific information about your question, but I can provide some background: Hastie and Tibshirani, Generalized Additive Models (Chapman and Hall, 1990), discuss analysis of variance and analysis of deviance procedures for GAMs, providing a heuristic justification and some simulation evidence. You'll find discussion of the subject on pages 65-67, 155-156, and 292-294.

I hope that this helps,
John

-----------------------------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
email: jfox at mcmaster.ca
phone: 905-525-9140x23604
web: www.socsci.mcmaster.ca/jfox
-----------------------------------------------------
> Is it possible to compare two GAM objects created with the gam() function
> from the mgcv package? I use a slightly modified version of anova.glm()
> named anova.gam(), modified from John Fox (2002). It often gives me
> aberrant results, especially with the "F" test. I use a quasibinomial model,
> and the scale (dispersion) is estimated and used in the calculation of the F
> value. Has anyone already tried this, or does anyone know whether all this
> is theoretically possible?

I'm slightly uncomfortable about trying to do this if you are selecting the degree of smoothing by generalized cross validation (GCV) or unbiased risk estimation (UBRE), as is the mgcv default. Both of these are basically mean square error criteria, and it seems to me more logically consistent to compare models by comparing their GCV or UBRE scores as appropriate.

If you want to select models by hypothesis testing, then it's possibly better to set up your GAMs using pure regression splines, rather than the penalized regression splines that are the mgcv default. mgcv allows you to do this. For example:

    mod.1 <- gam(y ~ s(v, u, k = 21, fx = TRUE) + s(x, k = 6, fx = TRUE))

fits a model involving a 20 df smooth of u and v and a 5 df smooth of x (k is the basis dimension of the smooth, but you lose a df through GAM identifiability constraints). The fitted model here is an un-penalized GLM, so standard distributional results for GLMs hold. The basis used maintains nestedness of models, thereby allowing use of analysis of deviance/variance. For example:

    mod.0 <- gam(y ~ s(v, u, k = 11, fx = TRUE) + s(x, k = 4, fx = TRUE))

is strictly nested within mod.1. The nesting is achieved by using a carefully chosen "optimal" basis for each smooth, based on optimal low rank approximation of thin plate splines: details are due out early next year in JRSSB, but I can send you a pre-print if you are interested.
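A minimal sketch of the nested comparison described above, under stated assumptions: the data frame `dat`, its simulated columns, and the quasibinomial family (taken from the original question) are invented here for illustration and are not from the thread. It relies on `anova()` accepting several fitted `gam` objects, in which case the comparison is an ordinary analysis of deviance between the un-penalized fits.

```r
library(mgcv)

# Simulated data, purely illustrative (assumption: not from the thread)
set.seed(1)
n <- 400
dat <- data.frame(u = runif(n), v = runif(n), x = runif(n))
eta <- sin(2 * pi * dat$x) + 2 * (dat$u - 0.5) * (dat$v - 0.5)
dat$y <- rbinom(n, size = 1, prob = plogis(eta))

# fx = TRUE switches the penalty off, so each fit is an ordinary GLM
# on a fixed spline basis (pure regression splines)
mod.1 <- gam(y ~ s(v, u, k = 21, fx = TRUE) + s(x, k = 6, fx = TRUE),
             family = quasibinomial, data = dat)
mod.0 <- gam(y ~ s(v, u, k = 11, fx = TRUE) + s(x, k = 4, fx = TRUE),
             family = quasibinomial, data = dat)

# Analysis of deviance between the nested fits; the F test uses the
# estimated dispersion, as is usual for quasi families
cmp <- anova(mod.0, mod.1, test = "F")
print(cmp)
```

The smaller model is listed first so that the table reports the change in deviance and degrees of freedom when moving from mod.0 to mod.1.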
If you really must mix hypothesis testing with MSE model selection, then I'd be inclined to use the very approximate p-values for terms reported by summary.gam(), but please read the warnings in the help file first!

Simon
______________________________________________________________________
> Simon Wood snw at st-and.ac.uk http://www.ruwpa.st-and.ac.uk/simon.html
> CREEM, The Observatory, Buchanan Gardens, St Andrews, Fife KY16 9LZ UK
> Direct telephone: (0)1334 461844 Indirect fax: (0)1334 463748
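A rough sketch of the two alternatives Simon mentions, comparing GCV scores between penalized fits and reading the approximate per-term p-values from summary.gam(). The data and model formulas are invented for illustration; the sketch assumes a current mgcv in which the fitted object stores its smoothness selection score in the `gcv.ubre` component and `summary()` returns a table of smooth terms as `s.table`.

```r
library(mgcv)

# Simulated data, purely illustrative (assumption: not from the thread)
set.seed(2)
n <- 400
dat <- data.frame(u = runif(n), x = runif(n))
dat$y <- rbinom(n, size = 1, prob = plogis(sin(2 * pi * dat$x)))

# Penalized fits; with a quasi family the scale is unknown, so
# method = "GCV.Cp" selects smoothing parameters by GCV
m.small <- gam(y ~ s(x), family = quasibinomial, data = dat,
               method = "GCV.Cp")
m.big <- gam(y ~ s(x) + s(u), family = quasibinomial, data = dat,
             method = "GCV.Cp")

# Compare the minimized GCV scores (lower is better)
scores <- c(small = as.numeric(m.small$gcv.ubre),
            big = as.numeric(m.big$gcv.ubre))
print(scores)

# Approximate p-values for each smooth term in the larger model
s.tab <- summary(m.big)$s.table
print(s.tab)
```

The p-values in the last column of the table are the approximate ones the warnings in ?summary.gam refer to, so they are best treated as a rough guide.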