Andrew Crane-Droesch
2013-Apr-16 21:35 UTC
[R] Understanding why a GAM can't have an intercept
Dear List,
I've just tried to specify a GAM without an intercept -- I've got one of
the (rare) cases where it is appropriate for E(y) -> 0 as X -> 0.
Naively running a GAM with "-1" appended to the formula and then
calling "predict.gam", I see that the model isn't behaving as
expected.
I don't understand why this would be. Google turns up this old R help
thread: http://r.789695.n4.nabble.com/GAM-without-intercept-td4645786.html
Simon writes:
Smooth terms are constrained to sum to zero over the covariate
values. This is an identifiability constraint designed to avoid
confounding with the intercept (particularly important if you have
more than one smooth).
If you remove the intercept from your model altogether (m2) then the
smooth will still sum to zero over the covariate values, which in your
case will mean that the smooth is quite a long way from the data. When
you include the intercept (m1) then the intercept is effectively
shifting the constrained curve up towards the data, and you get a
nice fit.
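
A minimal sketch of the behaviour described in the quote, assuming the mgcv package and simulated data. The names x, y, m1 and m2 are illustrative only and are not the models from the linked thread. It checks that the smooth term averages to zero over the observed covariate values whether or not the intercept is present, so that dropping the intercept leaves the fitted values centred on zero rather than on the data.

```r
library(mgcv)

set.seed(1)
n <- 200
x <- runif(n)
# Simulated response whose mean is well away from zero (an assumption
# made purely for illustration)
y <- 5 + sin(2 * pi * x) + rnorm(n, sd = 0.3)

m1 <- gam(y ~ s(x))       # intercept plus centred smooth
m2 <- gam(y ~ s(x) - 1)   # intercept dropped, smooth still centred

# Contribution of the smooth term at the observed x values
s1 <- predict(m1, type = "terms")[, 1]
s2 <- predict(m2, type = "terms")[, 1]

# Both averages should be zero up to numerical error
print(c(mean_s1 = mean(s1), mean_s2 = mean(s2)))

# With the intercept the fitted values are centred on mean(y);
# without it they are centred on zero
print(c(mean_y = mean(y),
        mean_fit_m1 = mean(fitted(m1)),
        mean_fit_m2 = mean(fitted(m2))))
```

If the sketch behaves as expected, m2 is not a curve forced through the origin; it is the same centred curve with nothing left to shift it up to the data.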
Why? I haven't read Simon's book in great detail, though I have read
Ruppert et al.'s Semiparametric Regression. I don't see a reason why a
penalized spline model shouldn't equal the intercept (or zero) when all
of the regressors equal zero.
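
To make that expectation concrete, here is a rough base-R sketch of a penalized spline in the truncated-line form used by Ruppert et al., fitted without an intercept column. This is not how mgcv builds its smooths. The knot locations, the fixed smoothing parameter lambda and the simulated data are all assumptions chosen for illustration. Because every basis function is zero at x = 0, the fitted curve is zero there by construction.

```r
set.seed(2)
n <- 200
x <- runif(n)
# Simulated data with E(y) -> 0 as x -> 0 (illustrative assumption)
y <- x * (1 + sin(2 * pi * x)) + rnorm(n, sd = 0.1)

# Truncated-line basis: x and (x - knot)_+ for each knot, no intercept column
knots <- seq(0.1, 0.9, by = 0.1)
trunc_basis <- function(x0) {
  cbind(x0, outer(x0, knots, function(a, k) pmax(a - k, 0)))
}
C <- trunc_basis(x)

# Ridge penalty on the knot coefficients only; lambda is fixed by hand here
# rather than estimated
lambda <- 0.01
D <- diag(c(0, rep(1, length(knots))))
beta <- solve(crossprod(C) + lambda * D, crossprod(C, y))

# Fitted curve at new points
f_hat <- function(x0) drop(trunc_basis(x0) %*% beta)

print(f_hat(0))                 # value of the fitted curve at the origin
print(f_hat(c(0.25, 0.5, 0.75)))
```

The difference from the mgcv case is the constraint: this basis is anchored at a point, whereas the smooths discussed in the quoted reply are centred over the observed covariate values.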
Is anyone able to help with a bit of intuition? Or relevant passages
from a good description of why this would be the case?
Furthermore, why is the "-1" formula specification accepted at all if
it doesn't work as it does in, for example, lm?
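
For comparison, a small sketch of what "-1" does in lm, using simulated data (the variable names are illustrative). Dropping the intercept there does force the fitted line through the origin, which is the behaviour the question above expects from the GAM.

```r
set.seed(3)
x <- runif(100)
y <- 2 * x + rnorm(100, sd = 0.1)   # simulated, true line through the origin

fit0 <- lm(y ~ x - 1)               # no intercept: only a slope is estimated

print(coef(fit0))

# Prediction at x = 0 is slope * 0
p0 <- unname(predict(fit0, newdata = data.frame(x = 0)))
print(p0)
```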
Many thanks,
Andrew
Andrew Crane-Droesch
2013-Apr-16 21:36 UTC
[R] Understanding why a GAM can't suppress an intercept
Andrew Crane-Droesch
2013-Apr-16 21:36 UTC
[R] Understanding why a GAM can't have an intercept
Please delete this thread -- wrong title.