astrzelczak@ps.pl
2005-Oct-12  09:04 UTC
[R] step.gam and number of tested smooth functions
Hi,

I'm working with step.gam in the gam package. I'm interested in both spline and lowess functions, and when I define all the models that I'm interested in I get something like this:

> gam.object.ALC<-gam(X143S~ALC,data=dane,family=binomial)
> step.gam.ALC<-step.gam(gam.object.ALC,scope=list("ALC"=~1+ALC+s(ALC,2)+s(ALC,3)+s(ALC,4)+s(ALC,6)+s(ALC,8)+lo(ALC,degree=1,span=.5)+lo(ALC,degree=2,span=.5)+lo(ALC,degree=1,span=.25)+lo(ALC,degree=2,span=.25)))
Start: X143S ~ ALC; AIC= 104.0815
Trial: X143S ~ 1; AIC= 111.1054
Trial: X143S ~ s(ALC, 2); AIC= 103.3325
Step : X143S ~ s(ALC, 2) ; AIC= 103.3325

Trial: X143S ~ s(ALC, 3); AIC= 102.9598
Step : X143S ~ s(ALC, 3) ; AIC= 102.9598

Trial: X143S ~ s(ALC, 4); AIC= 102.2103
Step : X143S ~ s(ALC, 4) ; AIC= 102.2103

Trial: X143S ~ s(ALC, 6); AIC= 102.4548

I have the impression that the algorithm stops as soon as the next trial gives a higher AIC, without examining the remaining functions. When I deleted the spline functions that were worse than s(ALC,4), I got:

> step.gam.ALC<-step.gam(gam.object.ALC,scope=list("ALC"=~1+ALC++s(ALC,4)+lo(ALC,degree=1,span=.5)+lo(ALC,degree=2,span=.5)+lo(ALC,degree=1,span=.25)+lo(ALC,degree=2,span=.25)))
Start: X143S ~ ALC; AIC= 104.0815
Trial: X143S ~ 1; AIC= 111.1054
Trial: X143S ~ s(ALC, 4); AIC= 102.2103
Step : X143S ~ s(ALC, 4) ; AIC= 102.2103

Trial: X143S ~ lo(ALC, degree = 1, span = 0.5); AIC= 99.8127
Step : X143S ~ lo(ALC, degree = 1, span = 0.5) ; AIC= 99.8127

Trial: X143S ~ lo(ALC, degree = 2, span = 0.5); AIC= 100.5275

Lowess turned out to be better in this situation, yet it was never tried in the first run. Is there any way to examine all the models without stopping when the AIC is higher in the next trial? Or is manual handling the only solution?

Thanks in advance for your help,

Agnieszka Strzelczak
Prof Brian Ripley
2005-Oct-12  09:20 UTC
[R] step.gam and number of tested smooth functions
step.gam is a tricky function to use correctly. You will need to consult the original documentation (in Chambers & Hastie, ca. 1992) or ask the package author for help.

BTW, it uses loess, not lowess.

-- 
Brian D. Ripley, ripley at stats.ox.ac.uk
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK, Fax: +44 1865 272595
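
Editor's sketch of the manual route raised in the question. As far as the gam documentation describes it, step.gam treats each scope formula as an ordered "regimen" and at each step only tries the entries next to the current one, which would be consistent with the search halting at s(ALC, 4) in the first run. The code below sidesteps the stepwise search by fitting every candidate term once and ranking the fits. It is a sketch under assumptions, not the package's own method: the data are simulated stand-ins (the real `dane` is not available), the helper `fit_one` and the table `aic_table` are hypothetical names, and the `aic` component of the fitted object is assumed to be the value step.gam prints.

```r
library(gam)

# Simulated stand-in for the poster's data frame `dane` (values are made up).
set.seed(42)
n <- 200
dane <- data.frame(ALC = runif(n, 0, 10))
dane$X143S <- rbinom(n, 1, plogis(-1 + 0.3 * dane$ALC))

# Every candidate form of the ALC term, copied from the scope in the question.
candidates <- c(
  "1", "ALC",
  "s(ALC, 2)", "s(ALC, 3)", "s(ALC, 4)", "s(ALC, 6)", "s(ALC, 8)",
  "lo(ALC, degree = 1, span = 0.5)", "lo(ALC, degree = 2, span = 0.5)",
  "lo(ALC, degree = 1, span = 0.25)", "lo(ALC, degree = 2, span = 0.25)"
)

# Hypothetical helper: fit one binomial GAM for a given right-hand side.
fit_one <- function(rhs) {
  gam(as.formula(paste("X143S ~", rhs)), data = dane, family = binomial)
}

# Fit all candidates, regardless of their order in the list.
fits <- lapply(candidates, fit_one)

# Collect the AIC stored on each fit (assumed to match step.gam's printout)
# and sort so the lowest AIC comes first.
aic_table <- data.frame(
  term = candidates,
  AIC  = sapply(fits, function(f) f$aic),
  stringsAsFactors = FALSE
)
aic_table <- aic_table[order(aic_table$AIC), ]
print(aic_table, row.names = FALSE)
```

With the real data, replacing the simulated `dane` by the actual data frame should give a table in which every spline and loess candidate appears, so a term that a neighbour-by-neighbour search never reaches is still compared. In recent versions of the gam package the stepwise function is named step.Gam.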