Michael
2006-Mar-16 09:30 UTC
[R] Did I use "step" function correctly? (Is R's step() function reliable?)
Hi all,

I put up an exhaustive model to use R's "step" function:

------------------------
mygam=gam(col1 ~ 1
    + col2 + col3 + col4
    + col2 ^ 2 + col3 ^ 2 + col4 ^ 2
    + col2 ^ 3 + col3 ^ 3 + col4 ^ 3
    + s(col2, 1) + s(col3, 1) + s(col4, 1)
    + s(col2, 2) + s(col3, 2) + s(col4, 2)
    + s(col2, 3) + s(col3, 3) + s(col4, 3)
    + s(col2, 4) + s(col3, 4) + s(col4, 4)
    + s(col2, 5) + s(col3, 5) + s(col4, 5)
    + s(col2, 6) + s(col3, 6) + s(col4, 6)
    + s(col2, 7) + s(col3, 7) + s(col4, 7)
    + s(col2, 8) + s(col3, 8) + s(col4, 8)
    + s(col2, 9) + s(col3, 9) + s(col4, 9),
    data=X);

mystep=step(mygam);
---------------------

After a long list, the following are the two lowest AIC values:

Step: AIC= 152.1
col1 ~ col2 + col3 + col4 + s(col2, 3) + s(col3, 3) + s(col4, 3)

Step: AIC= 153.45
col1 ~ col2 + col3 + col4 + s(col2, 3) + s(col3, 3)
-----------------------------------------------

However, the lowest-AIC model, "col1 ~ col2 + col3 + col4 + s(col2, 3) + s(col3, 3) + s(col4, 3)", does not give the best residual deviance.

Instead, the model "mygam3=gam(col1 ~ s(col2, 6) + s(col3, 6) + s(col4, 6), data=X)" is the best. In fact, I found that as I increase the "degree-of-freedom", it always gives a better residual deviance, lower than that of the "best" model returned by the "step" function... Please see below.

I am wondering if I need to increase the "degree-of-freedom" all the way up... Perhaps to avoid overfitting, I should do a cross validation. Is there an automatic cross validation inside "step" or "gam"?

Is the result of the "step" function reliable? Or perhaps I used it incorrectly?

Thanks a lot,

Michael.
--------------------------
> mygam1=gam(col1 ~ col2 + col3 + col4 + s(col2, 3) + s(col3, 3) + s(col4, 3), data=X);
> mygam2=gam(col1 ~ col2 + col3 + col4 , data=X);
> mygam3=gam(col1 ~ s(col2, 6) + s(col3, 6) + s(col4, 6), data=X);
> mygam1
Call:
gam(formula = col1 ~ col2 + col3 + col4 + s(col2, 3) + s(col3, 3) + s(col4, 3), data = X)

Degrees of Freedom: 110 total; 100.9999 Residual
Residual Deviance: 20.98365
> mygam2
Call:
gam(formula = col1 ~ col2 + col3 + col4, data = X)

Degrees of Freedom: 110 total; 107 Residual
Residual Deviance: 27.84808
> mygam3
Call:
gam(formula = col1 ~ s(col2, 6) + s(col3, 6) + s(col4, 6), data = X)

Degrees of Freedom: 110 total; 91.99957 Residual
Residual Deviance: 18.45776
> anova(mygam1, mygam2, mygam3);
Analysis of Deviance Table

Model 1: col1 ~ col2 + col3 + col4 + s(col2, 3) + s(col3, 3) + s(col4, 3)
Model 2: col1 ~ col2 + col3 + col4
Model 3: col1 ~ s(col2, 6) + s(col3, 6) + s(col4, 6)
  Resid. Df Resid. Dev      Df Deviance P(>|Chi|)
1  100.9999    20.9836
2  107.0000    27.8481 -6.0001  -6.8644 6.115e-06
3   91.9996    18.4578 15.0004   9.3903 3.958e-05
Berton Gunter
2006-Mar-16 17:14 UTC
[R] Did I use "step" function correctly? (Is R's step() function reliable?)
The questions you ask lead to far more complex issues than you are aware of. As a result, there are no "good" -- nor certainly any simple -- answers to them. The deeper issue is how to choose an appropriate model, balancing complexity (overfitting) with parsimony (predictive/scientific validity).

A few miscellaneous comments:

1) Increasing model complexity (more parameters, more df for the model) for nested models must **always** improve (or cannot harm, anyway) the fit; this is just a mathematical identity, and so your comment to this effect is basically meaningless.

2) The hard question is: does increased complexity improve the fit **enough** to believe that it is "meaningful"?

3) AIC, BIC, cross-validation, extra sums of squares principles, anovas via likelihood ratio tests, etc. are all aimed at this question. All are sometimes useful and sometimes produce junk.

4) There are now many statisticians and statistical learners who believe that the basic question -- what is the right model? -- is fundamentally misleading. Their view is (approximately, I'm no expert) that there are always several (many?) different models that are essentially equally good. So they cast the issue as one of prediction and eschew a single model altogether. Instead, they have developed various approaches to creating "ensembles" of models that are basically just prediction engines largely incapable of interpretation. Random forests, boosting, and bagging are some of the buzzwords here, and R has packages for all.

Brian Ripley has at least one excellent presentation on these issues on his website (his presentation on John Nelder's 80th birthday). Unfortunately, while I have the paper, I no longer have the link. Perhaps he might re-post it. You might also wish to have a look at some of Leo Breiman's writings on these issues.

I repeat: I am not an expert on these matters, and my brief comments above no doubt already contain distortions and inaccuracies.
I would greatly appreciate it if those with real expertise would correct any of the more egregious errors.

Finally, if I may hazard a personal opinion regarding use of stepAIC or other stepwise fitting methods: Beware! -- there lie dragons! They can be an excellent way to generate completely spurious models.

Cheers,
Bert

-- 
Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA

"The business of the statistician is to catalyze the scientific learning process." - George E. P. Box
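Point 1 above can be checked numerically. The following is only a minimal sketch in base R on simulated data; the variables x and y, the sample size, and the range of polynomial degrees are assumptions made up for illustration and have nothing to do with the data set X in the original post. For nested least-squares fits the residual sum of squares cannot go up as terms are added, whereas AIC charges a penalty per parameter and so need not keep falling.

```r
# Simulated data: the true relationship is quadratic (an assumption for this sketch).
set.seed(123)
n <- 110
x <- runif(n, -1, 1)
y <- 1 + 2 * x - 1.5 * x^2 + rnorm(n, sd = 0.5)

# Nested polynomial models of degree 1..8.
# poly() is used because x^2 inside a formula is formula algebra, not
# arithmetic squaring; I(x^2) or poly() is needed for a real quadratic term.
degrees <- 1:8
fits <- lapply(degrees, function(d) lm(y ~ poly(x, d)))

rss <- sapply(fits, deviance)  # residual sum of squares (deviance of an lm)
aic <- sapply(fits, AIC)       # penalised criterion

# RSS is non-increasing in the degree; AIC typically bottoms out and rises again.
print(data.frame(degree = degrees, rss = rss, aic = aic))
cat("degree with smallest RSS:", degrees[which.min(rss)], "\n")
cat("degree with smallest AIC:", degrees[which.min(aic)], "\n")
```

Reading the RSS column alone would always favour the largest model, which is why a lower residual deviance for a more flexible fit is not by itself evidence that the fit is better.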
Michael
2006-Mar-17 07:17 UTC
[R] Did I use "step" function correctly? (Is R's step() function reliable?)
Hi L.Y,

Thank you for your advice.

Are you talking about Trevor Hastie's gam()? I did not see anywhere in the results that it does an automatic cross validation.

I also could not verify that the gam() function will automatically find the degrees of freedom if I don't specify the df and just use terms such as

s(col1) + s(col2) ...

Does the "step()" function also include the gam() with CV and auto-tweaking for df?

I wondered whether I called "step()" correctly, because it looks to me that it only ran for a very short time (1 second) and immediately returned two models which in fact have even larger residual deviance than the model I provided to it initially... (obviously I've included every possibility in the initial model, and rely on the step() function to cut off some terms for me...)

Thanks a lot!

On 3/16/06, Dr L. Y Hin <lyhin@netvigator.com> wrote:
> The engine of gam() lies in a function called smooth.spline() that is found
> in the library splines. If you leave out specifying the degree of freedom in
> the formula specification, it will automatically specify it for you via
> cross-validation. The results of model fit obtainable via summary(mygam) will
> show you the "degree of freedom as chosen by the cross-validation method".
>
> On a more philosophical plane, Buja et al. (Ann Stat. 1989;17(2):453-510)
> pointed out that linear smoothers such as cubic splines and smoothing splines
> are linear because they are x-dependent and not y-dependent. By using
> cross-validation, you will invariably involve the use of y, which renders the
> determination of degree of freedom y-dependent, hence the smoothing parameter
> \lambda y-dependent, and in such a case the smoothing matrix is, strictly
> speaking, non-linear, because S = (I + \lambda K)^{-1} in the non-weighted
> form with unique x-points.
>
> If you increase the degree of freedom, \lambda decreases, to a point where
> you will effectively have a straightforward interpolation of points on the
> graph. Conversely, if \lambda is increased, the smoothing line reduces to a
> linear regression line through all the points.
>
> In my opinion, AIC and residual sum of squares are competing tools looking
> for the best fit. The minimum of AIC and that of RSS may not concur. If you
> believe in AIC, then I would assume you also believe that it is a better tool
> than RSS in that the former uses an information theoretic approach, which is
> not sensitive to offset in accuracy due to penalization of outliers.
> Following that, I would disregard RSS and go according to what AIC tells me.
>
> I don't think you have used step.gam incorrectly, but I think you have been
> observant enough to realize not all statistical tools agree all the time :)
>
> Lin
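The relation between degrees of freedom and \lambda described in the quoted message can be illustrated directly with smooth.spline() from base R's stats package. This is a sketch on simulated data and makes no claim about what gam() does internally; x, y and the grid of df values are made up for illustration. When neither df nor lambda is supplied, smooth.spline() chooses the smoothing parameter itself, by generalized cross-validation unless cv = TRUE is set.

```r
# Simulated single-predictor data (an assumption for this sketch).
set.seed(42)
n <- 110
x <- sort(runif(n))
y <- sin(2 * pi * x) + rnorm(n, sd = 0.3)

# Fit smoothing splines at several fixed target degrees of freedom.
dfs  <- c(3, 6, 12, 24)
fits <- lapply(dfs, function(d) smooth.spline(x, y, df = d))

lambda <- sapply(fits, function(f) f$lambda)                    # smoothing parameter
rss    <- sapply(fits, function(f) sum((y - predict(f, x)$y)^2)) # training RSS

# More df means a smaller lambda and a smaller residual sum of squares.
print(data.frame(df = dfs, lambda = lambda, rss = rss))

# Leaving df unspecified lets smooth.spline() pick lambda by GCV,
# so the resulting df depends on y as well as on x.
auto <- smooth.spline(x, y)
cat("df chosen by GCV:", round(auto$df, 2), " lambda:", signif(auto$lambda, 3), "\n")
```

The fixed-df fits show why residual deviance alone keeps rewarding larger df, while the automatic fit shows the y-dependent choice of \lambda that the quoted message refers to.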
Michael
2006-Mar-17 08:54 UTC
[R] Did I use "step" function correctly? (Is R's step() function reliable?)
Dear Dr. L. Y Hin,

Thank you very much for your help once again!

Would it be possible that gam() just picks a default df to use? For example, is its default df 4?

?s

s(x, df=4, spar=1)

---------------------

s(col3, 9) constructs the smoothing matrix using df=9; s() is a function that constructs the smoothing matrix...

> As far as I am aware, once you specify a smoothing term, you need a
> smoothing parameter to perform the smoothing, and without finding a
> smoothing parameter using the automatic cross-validation measure, gam()
> cannot come up with one which it can use to construct the linear smoother,
> and hence the calculation cannot proceed. Therefore, even if you have not
> specified a degree of freedom, gam has actually done so for you. You can try
> to fit a model, say,
> try <- gam(y~s(col1)+s(col2),...)
> and then look at the results by
> summary(try)
> and you will see a column called degree of freedom. Without automatic
> cross-validation, you cannot have results in this column.
>
> Can I ask what, for example, s(col3, 9) refers to?
> Clarifying that may help me give you a more tangible and workable
> suggestion.
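One way to pick df without relying on a default is explicit K-fold cross-validation. The sketch below uses smooth.spline() from base R on a single simulated predictor; the data, the choice of 5 folds, the df grid and the helper cv_mse() are all hypothetical and only illustrate the idea. Judging from the help output quoted above, s(x, df=4, spar=1), a bare s(x) term would appear to use df=4 rather than a cross-validated value, so a search like this has to be done by hand.

```r
# Simulated data (an assumption for this sketch, not the data set X).
set.seed(1)
n <- 110
x <- sort(runif(n))
y <- sin(2 * pi * x) + rnorm(n, sd = 0.3)

# Assign each observation to one of K folds at random.
K <- 5
folds <- sample(rep(seq_len(K), length.out = n))

# Hypothetical helper: mean squared prediction error over the K held-out folds
# for a smoothing spline with the given target df.
cv_mse <- function(df) {
  errs <- sapply(seq_len(K), function(k) {
    train <- folds != k
    fit   <- smooth.spline(x[train], y[train], df = df)
    pred  <- predict(fit, x[!train])$y
    mean((y[!train] - pred)^2)
  })
  mean(errs)
}

dfs <- 3:15
cv  <- sapply(dfs, cv_mse)

# Training RSS on the full data, for comparison with the held-out error.
rss <- sapply(dfs, function(d) {
  fit <- smooth.spline(x, y, df = d)
  sum((y - predict(fit, x)$y)^2)
})

print(data.frame(df = dfs, cv_mse = cv, train_rss = rss))
best_df <- dfs[which.min(cv)]
cat("df with smallest cross-validated error:", best_df, "\n")
```

The training RSS keeps shrinking as df grows, while the held-out error is the quantity that actually guards against overfitting.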