Hello,

in the paper "Avoiding the effects of concurvity in GAM's .." by Figueiras et al. (2003) it is mentioned that in a GLM collinearity is taken into account in the calculation of the standard errors, but not in a GAM (which results in confidence intervals that are too narrow and p-values that are understated; this refers to the S-Plus version of GAM). I haven't found any references to GAM and concurvity or collinearity on the R page, and I wonder whether the R version of gam differs on this point.

Another question: what is the best manual way to do variable selection, given the lack of a stepwise procedure for GAM? Start with the first variables, add var1, and keep it if GCV improves (what would be considered an improvement?) or the p-value is significant, otherwise drop it; then add var2, and so on?

thanks in advance, cheers Martin
As someone else (Simon Wood, for instance) could explain much better, and as is stressed in the help files of the mgcv package (the package that includes the gam() function), gam in R is not a clone of gam in S+. S+ uses backfitting, while R uses penalized splines (see the references listed in the gam() documentation). The approaches are quite different and can lead to substantial differences in particular cases, for instance with concurvity.

best, vito

PS Can you point out the exact reference for "Figueiras et al. (2003)"?

----- Original Message -----
From: Martin Wegmann <mailinglist.wegmann at gmx.net>
To: R-list <r-help at stat.math.ethz.ch>
Sent: Tuesday, September 16, 2003 3:47 PM
Subject: [R] gam and concurvity
On Tue, 16 Sep 2003, Martin Wegmann wrote:

> in the paper "Avoiding the effects of concurvity in GAM's .." of Figueiras et
> al. (2003) it is mentioned that in GLM collinearity is taken into account in
> the calc of se but not in GAM (-> results in confidence interval too narrow,
> p-value understated, GAM S-Plus version). I haven't found any references to
> GAM and concurvity or collinearity on the R page. And I wonder if the R
> version of Gam differ in this point.

They do. R gam() uses penalised splines, resulting in an easily managed design matrix. S-PLUS gam() uses smoothing splines, and (until recently) there wasn't any known feasible formula for the standard errors.

However:

1/ `Concurvity' is a serious problem only for a few extreme uses of gam. Even in the air pollution time series studies that provoked the recent fuss, its impact is really important only in studies that very aggressively removed seasonal patterns or in data with huge seasonal variations (eg inland Canada).

2/ These two cases are precisely the cases where the results are sensitive to the choice of time scale at which seasonal variation confounds the association, a choice that is not identifiable from the data.

3/ Neither S-PLUS nor R gam() standard errors incorporate the uncertainty in an automatically chosen smoothing parameter.

4/ Trevor Hastie and colleagues have written software for calculating correct standard errors for S-PLUS gam.

-thomas

Thomas Lumley
Assoc. Professor, Biostatistics
tlumley at u.washington.edu
University of Washington, Seattle
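
For readers who want to try this, the following is a minimal sketch in R, not something posted in the thread. It assumes a current version of mgcv, whose concurvity() function was added after this discussion. The simulated data and the names x1, x2, x3, fit_small and fit_big are illustrative assumptions only. It fits penalised-spline GAMs with GCV-chosen smoothing parameters, reports concurvity indices, extracts standard errors for the smooth terms, and compares GCV scores between two candidate models as one possible ingredient of manual variable selection.

```r
# Illustrative sketch only: simulated data, hypothetical variable names.
library(mgcv)

set.seed(1)
n  <- 400
x1 <- runif(n)
x2 <- x1^2 + rnorm(n, sd = 0.05)   # x2 is nearly a smooth function of x1
x3 <- runif(n)                     # unrelated to the response
y  <- sin(2 * pi * x1) + 0.5 * x2 + rnorm(n, sd = 0.3)
dat <- data.frame(y = y, x1 = x1, x2 = x2, x3 = x3)

# Penalised regression spline fits; smoothing parameters chosen by GCV
fit_small <- gam(y ~ s(x1) + s(x2), data = dat, method = "GCV.Cp")
fit_big   <- gam(y ~ s(x1) + s(x2) + s(x3), data = dat, method = "GCV.Cp")

# Concurvity indices per term (rows: worst, observed, estimate).
# Values near 1 flag a term that the rest of the model can nearly reproduce.
conc <- concurvity(fit_small, full = TRUE)
print(conc)

# Pointwise standard errors for each smooth term.
# These are conditional on the estimated smoothing parameters.
pr <- predict(fit_small, type = "terms", se.fit = TRUE)
print(head(pr$se.fit))

# GCV scores of the two candidate models (smaller is better)
gcv <- c(small = unname(fit_small$gcv.ubre), big = unname(fit_big$gcv.ubre))
print(gcv)
```

How large a drop in GCV counts as an improvement remains a judgment call, so the comparison is best read alongside summary(fit_big) and plots of the smooths rather than used as a mechanical rule.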