thr3ads.net - R devel - [Rd] Formulas in gam function of mgcv package [Aug 2009]

Corrado

2009-Aug-24 16:02 UTC

[Rd] Formulas in gam function of mgcv package

Dear R-experts,

I have a question on the formulas used in the gam function of the mgcv 
package.

I am trying to understand the relationships between:

y~s(x1)+s(x2)+s(x3)+s(x4)

and 

y~s(x1,x2,x3,x4)

Does the latter contain the former? what about the smoothers of all 
interaction terms? 

I have (tried to) read the manual pages of gam, formula.gam, smooth.terms, 
linear.functional.terms but could not understand properly.

Regards
-- 
Corrado Topi

Global Climate Change & Biodiversity Indicators
Area 18,Department of Biology
University of York, York, YO10 5YW, UK
Phone: + 44 (0) 1904 328645, E-mail: ct529 at york.ac.uk

Gavin Simpson

2009-Aug-24 16:33 UTC

[R] [Rd] Formulas in gam function of mgcv package

[Note R-Devel is the wrong list for such questions. R-Help is where this
should have been directed - redirected there now]

On Mon, 2009-08-24 at 17:02 +0100, Corrado wrote:
> Dear R-experts,
> 
> I have a question on the formulas used in the gam function of the mgcv 
> package.
> 
> I am trying to understand the relationships between:
> 
> y~s(x1)+s(x2)+s(x3)+s(x4)
> 
> and 
> 
> y~s(x1,x2,x3,x4)
> 
> Does the latter contain the former? what about the smoothers of all 
> interaction terms?

I'm not 100% certain how this scales to smooths of more than 2
variables, but Sections 4.10.2 and 5.2.2 of Simon Wood's book
Generalized Additive Models: An Introduction with R (2006, Chapman &
Hall/CRC) discuss this for smooths of 2 variables.

Strictly y ~ s(x1) + s(x2) is not nested in y ~ s(x1, x2) as the bases
used to produce the smoothers in the two models may not be the same in
both models. One option to ensure nestedness is to fit the more
complicated model as something like this:

## if simpler model were: y ~ s(x1, k=20) + s(x2, k = 20)
y ~ s(x1, k=20) + s(x2, k = 20) + s(x1, x2, k = 60)
                                  ^^^^^^^^^^^^^^^^^ 
where the last term (^^^ above) has the same k as used in s(x1, x2)

Note that these are isotropic smooths; are x1 and x2 measured in the
same units etc.? Tensor product smooths may be more appropriate if not,
and if we specify the bases when fitting the models, s(x1) + s(x2) *is*
strictly nested in te(x1, x2), e.g.

y ~ s(x1, bs = "cr", k = 10) + s(x2, bs = "cr", k = 10)

is strictly nested within

y ~ te(x1, x2, k = 10)
## is the same as y ~ te(x1, x2, bs = "cr", k = 10)

[Note that bs = "cr" is the default basis in te() smooths, hence we
don't need to specify it, and k = 10 refers to each individual smooth in
the te().]
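
The comparison above can be tried on simulated data. This is only a
minimal sketch: the data-generating function, sample size, seed and the
object names (dat, m_add, m_te) are assumptions made for illustration
and are not from the thread. ML smoothness selection is used because it
is the usual recommendation when comparing models that differ in their
smooth terms.

```r
library(mgcv)

set.seed(1)
n   <- 400
dat <- data.frame(x1 = runif(n), x2 = runif(n))
## hypothetical truth: additive part plus a mild interaction
dat$y <- sin(2 * pi * dat$x1) + (dat$x2 - 0.5)^2 +
  0.5 * dat$x1 * dat$x2 + rnorm(n, sd = 0.3)

## additive model with explicitly specified marginal bases
m_add <- gam(y ~ s(x1, bs = "cr", k = 10) + s(x2, bs = "cr", k = 10),
             data = dat, method = "ML")

## tensor product smooth built from the same marginal bases
m_te <- gam(y ~ te(x1, x2, bs = "cr", k = 10),
            data = dat, method = "ML")

## informal comparison; the p-value from anova() is approximate
AIC(m_add, m_te)
anova(m_add, m_te, test = "F")
```

If the interaction matters, m_te would be expected to have the lower
AIC; with purely additive data the two fits should be close.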

HTH

G
>  
> 
> I have (tried to) read the manual pages of gam, formula.gam, smooth.terms, 
> linear.functional.terms but could not understand properly.
> 
> Regards

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
 Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%

Corrado

2009-Aug-26 10:13 UTC

[R] [Rd] Formulas in gam function of mgcv package

Dear Simon,

thanks for your answer.

I am running the model with both s and te smoothing, to compare.

A few questions on your email:

1) Isotropic smoothness: my variables are centred and scaled. I assumed an 
isotropic smoother (that is, a smoother that treats all the variables in the 
same way) was good. What do you think? Is my understanding of isotropic 
smoothing wrong? 

2) s(x1,...,xn): it does not contain (1), but I thought it still improves on 
(1) by being free to include some interaction, albeit not explicitly. Is my 
interpretation wrong?

3) te: I am confused! What does it mean that the function space for (4) is 
built up from the function spaces used in (3)? Does it mean that 
te(x1,...,xn) is an expansion on the te(xi), including all the product terms 
te(x1)*te(x2)*...*te(xj)*...*te(xn) of the different orders?

Example: in the case of 4 variables, including te(x1)*te(x2), te(x2)*te(x3), 
..., te(x1)*te(x2)*te(x3), ..., up to te(x1)*te(x2)*te(x3)*te(x4).

Sorry for being particularly daft ....

Regards


On Wednesday 26 August 2009 09:56:13 you wrote:
> > > I am trying to understand the relationships between:
> > >
> > > y~s(x1)+s(x2)+s(x3)+s(x4)
> > >
> > > and
> > >
> > > y~s(x1,x2,x3,x4)
> > >
> > > Does the latter contain the former? what about the smoothers of all
> > > interaction terms?
>
> The first says that you want a model
> E(y) = f_1(x_1) + f_2(x_2) + f_3(x_3) + f_4(x_4) (1)
> where the f_j are smooth functions. The additive decomposition is quite a
> strong assumption, since it assumes that the effect of x_j is not dependent
> on x_k unless j=k. The second model is just
> E(y) = f(x_1,x_2,x_3,x_4)                                         (2)
> where f is a smooth function. This looks very general, but actually `s'
> terms assume isotropic smoothness, which is also quite a strong assumption.
>
> Now if I simply state that f and the f_j are `smooth functions', and leave
> it at that, then (2) would of course contain (1), but to actually estimate
> the models I need to state, mathematically, what I mean by `smooth'. Once
> I've done that I've pretty much determined the function spaces in which f
> and the f_j will lie, and in general (2) will no longer strictly contain
> (1). mgcv's `s' terms use a thin plate spline measure of smoothness for
> multivariate smooths, and this means that (1) will not be strictly nested
> within (2), since e.g. a 4D thin plate spline can not generally represent
> exactly what the sum of 4 1D splines can represent.
>
> If you want to achieve exact nesting then using tensor product smooths with
> something like
>
> y~te(x1)+te(x2)+te(x3)+te(x4)   (3)
>
> y~te(x1,x2,x3,x4)                         (4)
>
> will do the trick (because the function space for (4) is built up from the
> function spaces used in (3)).
>
> As to where all the 2 and 3 way interactions have gone in (4)... it's just
> like ANOVA - if you put in a 4 way interaction then the lower order
> interactions are not identifiable, unless you choose to add constraints to
> make them so. `mgcv' will allow you to add main effects and interactions, and
> will handle the constraints automatically, but if this sort of functional
> ANOVA is a major component of what you want to do, then it is probably
> worth checking out the gss package and Chong Gu's book on smoothing spline
> ANOVA.
>
> best,
> Simon
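
The ANOVA-style decomposition mentioned in the quoted reply can be
sketched for two variables. This is only an illustration under stated
assumptions: it uses ti() terms, which were added to mgcv after this
thread was written, and the simulated data, seed and object names (dat,
m_anova, m_full) are hypothetical. ti(x1, x2) is a tensor product
interaction from which the main effects have been excluded, so it can
sit alongside ti(x1) and ti(x2) and remain identifiable.

```r
library(mgcv)

set.seed(2)
n   <- 500
dat <- data.frame(x1 = runif(n), x2 = runif(n))
## hypothetical truth with two main effects and one interaction
dat$y <- sin(2 * pi * dat$x1) + dat$x2^2 +
  dat$x1 * dat$x2 + rnorm(n, sd = 0.3)

## main effects plus a "pure" interaction term
m_anova <- gam(y ~ ti(x1) + ti(x2) + ti(x1, x2),
               data = dat, method = "REML")

## single tensor product smooth, with main effects and interaction
## absorbed into one term
m_full <- gam(y ~ te(x1, x2), data = dat, method = "REML")

## one row per smooth term: the interaction can be inspected separately
summary(m_anova)$s.table
```

In m_full the lower order effects cannot be read off separately, which
is the identifiability point made in the reply; m_anova separates them
at the cost of estimating more smoothing parameters.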


-- 
Corrado Topi

Global Climate Change & Biodiversity Indicators
Area 18,Department of Biology
University of York, York, YO10 5YW, UK
Phone: + 44 (0) 1904 328645, E-mail: ct529 at york.ac.uk
