This is covered in the helpfile, but perhaps not clearly enough.
The gam chapter in the "white book" has more details.
step.gam moves through the terms in each element of the scope argument
in an ordered fashion.
So if a scope element is
~ 1 + x + s(x,4) + s(x,8)
and the formula at some stage is ~ x + ....
then if direction="both", the routine checks both "1" and "s(x,4)"
(i.e. up or down the hierarchy by one move),
and does not check "s(x,8)".
If direction="forward", it will only look at "s(x,4)", and so on.
This ordered behaviour was imposed in order to put some structure on the
search, and to reduce the computational and variance overhead of a
complete search.
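
The one-move rule described above can be sketched in a few lines of base R. This is only an illustration of the ordering idea, not the code inside step.gam: the helper name candidate_moves and its arguments are hypothetical, and the terms are handled as plain character strings rather than fitted models.

```r
# Hypothetical helper (NOT part of the gam package): given the ordered terms
# of one scope element and the term currently in the model, return the terms
# that a one-move search would try next.
candidate_moves <- function(regimen, current,
                            direction = c("both", "backward", "forward")) {
  direction <- match.arg(direction)
  pos <- match(current, regimen)
  if (is.na(pos)) stop("'current' must be one of the terms in 'regimen'")
  # One step down the hierarchy, one step up, or both.
  offsets <- switch(direction,
                    both     = c(-1L, 1L),
                    backward = -1L,
                    forward  = 1L)
  idx <- pos + offsets
  # Drop moves that would fall off either end of the ordered list.
  idx <- idx[idx >= 1L & idx <= length(regimen)]
  regimen[idx]
}

regimen <- c("1", "x", "s(x,4)", "s(x,8)")
candidate_moves(regimen, "x", "both")     # "1" and "s(x,4)"; "s(x,8)" is skipped
candidate_moves(regimen, "x", "forward")  # only "s(x,4)"
```

Under this rule a term further along the list, such as the lo() terms in the scope quoted below, is reached only if every intermediate move improves the criterion, which is consistent with the trace stopping once s(ALC, 6) gave a higher AIC than s(ALC, 4).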
astrzelczak@ps.pl wrote:
>Dear Professor Hastie,
>
>
>I asked a question on r-help@stat.math.ethz.ch and I was told it'd be better to
>contact you about my problem.
>
>I'm working with step.gam in the gam package. I'm interested both in spline and
>loess functions and when I define all the models that I'm interested in I get
>something like that:
>
>
>
>gam.object.ALC<-gam(X143S~ALC,data=dane,family=binomial)
>step.gam.ALC<-step.gam(gam.object.ALC,scope=list("ALC"=~1+ALC+s(ALC,2)+s(ALC,3)+s(ALC,4)+s(ALC,6)+s(ALC,8)+lo(ALC,degree=1,span=.5)+lo(ALC,degree=2,span=.5)+lo(ALC,degree=1,span=.25)+lo(ALC,degree=2,span=.25)))
> Start: X143S ~ ALC; AIC= 104.0815
> Trial: X143S ~ 1; AIC= 111.1054
> Trial: X143S ~ s(ALC, 2); AIC= 103.3325
> Step : X143S ~ s(ALC, 2) ; AIC= 103.3325
>
> Trial: X143S ~ s(ALC, 3); AIC= 102.9598
> Step : X143S ~ s(ALC, 3) ; AIC= 102.9598
>
> Trial: X143S ~ s(ALC, 4); AIC= 102.2103
> Step : X143S ~ s(ALC, 4) ; AIC= 102.2103
>
> Trial: X143S ~ s(ALC, 6); AIC= 102.4548
>
>I have the impression that the algorithm stops when the next trial gives a higher
>AIC without examining further functions. When I deleted some of the spline
>functions that were worse than s(ALC,4) I got:
>
>
>
>step.gam.ALC<-step.gam(gam.object.ALC,scope=list("ALC"=~1+ALC++s(ALC,4)+lo(ALC,degree=1,span=.5)+lo(ALC,degree=2,span=.5)+lo(ALC,degree=1,span=.25)+lo(ALC,degree=2,span=.25)))
> Start: X143S ~ ALC; AIC= 104.0815
> Trial: X143S ~ 1; AIC= 111.1054
> Trial: X143S ~ s(ALC, 4); AIC= 102.2103
> Step : X143S ~ s(ALC, 4) ; AIC= 102.2103
>
> Trial: X143S ~ lo(ALC, degree = 1, span = 0.5); AIC= 99.8127
> Step : X143S ~ lo(ALC, degree = 1, span = 0.5) ; AIC= 99.8127
>
> Trial: X143S ~ lo(ALC, degree = 2, span = 0.5); AIC= 100.5275
>
>Loess turned out to be better in this situation. Is there any way to examine
>all the models without stopping when AIC is higher in the next trial? How to
>handle this problem?
>
>I'd be grateful for any advice
>
>best regards
>
>Agnieszka Strzelczak, MSC
>
>PhD fellow
>Ministry of the Environment
>National Environmental Research Institute
>Vejlsøvej 25
>P.O. Box 314
>DK-8600 Silkeborg
>Denmark
>Phone +45 89 20 14 00
>Fax +45 89 20 14 14
>e-mail: as@dmu.dk
>
>PhD student
>Institute of Chemistry and Environmental Protection
>Szczecin University of Technology
>Aleja Piastow 42
>71-065 Szczecin
>Phone +48 91 449 45 35
>e-mail: astrzelczak@ps.pl
>
>
--
--------------------------------------------------------------------
Trevor Hastie hastie@stanford.edu
Professor, Department of Statistics, Stanford University
Phone: (650) 725-2231 (Statistics) Fax: (650) 725-8977
(650) 498-5233 (Biostatistics) Fax: (650) 725-6951
URL: http://www-stat.stanford.edu/~hastie
address: room 104, Department of Statistics, Sequoia Hall
390 Serra Mall, Stanford University, CA 94305-4065
--------------------------------------------------------------------