robert.ptacnik at niva.no
2007-Jun-15 07:06 UTC
[R] interpretation of F-statistics in GAMs
Dear listers,

I use gam (from mgcv) to evaluate the shape and strength of relationships between a response variable and several predictors. How can I interpret the 'F' values given in the GAM summary? Is it appropriate to treat them in a similar manner to the t-statistics in a linear model, i.e. do larger values mean that a variable has a stronger impact than a variable with a smaller F?

When I run my analysis for two different response variables (but identical predictors), is there a way to compare the F values among tests (e.g. by standardizing them by the sum of F within each test)? I append two summaries below.

Thanks in advance,
Robert

### example 1 ###

Family: gaussian
Link function: identity

Formula:
dep[sel, i] ~ s(date, k = 3) + s(depth, k = kn) + s(temp, k = kn) +
    s(light, k = kn) + s(PO4, k = kn) + s(DIN, k = kn) + s(prop.agpla,
    k = kn)

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   5.1048     0.0384   132.9   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                edf Est.rank      F  p-value
s(date)       1.669        2 12.161 1.07e-05 ***
s(depth)      1.671        2 36.125 4.85e-14 ***
s(temp)       1.927        2  6.686  0.00156 **
s(light)      1.886        2 12.604 7.20e-06 ***
s(PO4)        1.676        2  3.237  0.04143 *
s(DIN)        1.000        1 38.428 3.41e-09 ***
s(prop.agpla) 1.405        2 15.987 3.79e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) = 0.687   Deviance explained = 70.5%
GCV score = 0.31995  Scale est. = 0.30076   n = 204

### example 2 ###

Family: gaussian
Link function: identity

Formula:
dep[sel, i] ~ s(date, k = 3) + s(depth, k = kn) + s(temp, k = kn) +
    s(light, k = kn) + s(PO4, k = kn) + s(DIN, k = kn) + s(prop.agpla,
    k = kn)

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  7.13588    0.05549   128.6   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                edf Est.rank      F  p-value
s(date)       1.944        2 15.997 3.67e-07 ***
s(depth)      1.876        2 25.427 1.52e-10 ***
s(temp)       1.000        1  2.866   0.0921 .
s(light)      1.751        2  4.212   0.0162 *
s(PO4)        1.950        2 10.632 4.14e-05 ***
s(DIN)        1.805        2 10.745 3.73e-05 ***
s(prop.agpla) 1.715        2  2.674   0.0715 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) = 0.479   Deviance explained = 50.9%
GCV score = 0.6863   Scale est. = 0.64348   n = 209
On Friday 15 June 2007 08:06, robert.ptacnik at niva.no wrote:
> Dear listers,
> I use gam (from mgcv) to evaluate the shape and strength of relationships
> between a response variable and several predictors.
> How can I interpret the 'F' values given in the GAM summary? Is it
> appropriate to treat them in a similar manner to the t-statistics in a
> linear model, i.e. do larger values mean that a variable has a stronger
> impact than a variable with a smaller F?

I'd be a bit cautious about this (even for t-statistics and linear models it's not quite clear to me what `impact' means if judged this way). These gam F statistics are only meant to provide a rough and ready means of judging approximate significance of terms, and I'm unsure about interpreting a comparison of such F ratios: for example, the F statistics can be based on different numbers of degrees of freedom, depending on the term concerned...

> When I run my analysis for two different response variables (but identical
> predictors), is there a way to compare the F values among tests (e.g. by
> standardizing them by the sum of F within each test)? I append two summaries
> below.

Again, I don't really know how this would work. I'd be more inclined to compare the plotted terms and associated CIs (and maybe the p-values), especially if you are using GAMs in a quite exploratory way (e.g. if the assumption of an additive structure is really a convenience, rather than being something that is suggested by the underlying science).

best,
Simon

-- 
Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
+44 1225 386603 www.maths.bath.ac.uk/~sw283
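
A rough sketch of the comparison suggested in the reply might look like the following. It is not the poster's analysis: the data are simulated, and the names `dat`, `x1`, `x2`, `y1`, `y2` and the choice `k = 5` are assumptions made purely for illustration. It fits the same smooth terms to two responses, pulls out the approximate significance tables, and plots each term with its default confidence band so that shapes and uncertainty can be compared directly. The last lines illustrate why raw F ratios are awkward to compare: the same F value corresponds to different p-values when the numerator degrees of freedom differ. Note that recent mgcv versions label the second column of the table `Ref.df` rather than `Est.rank`.

```r
library(mgcv)

set.seed(1)
n <- 200

# Hypothetical simulated data; all variable names are illustrative only
dat <- data.frame(x1 = runif(n), x2 = runif(n))
dat$y1 <- sin(2 * pi * dat$x1) + 0.5 * dat$x2 + rnorm(n, sd = 0.3)
dat$y2 <- 0.3 * sin(2 * pi * dat$x1) + 2 * dat$x2^2 + rnorm(n, sd = 0.6)

# Same predictors, two different responses
m1 <- gam(y1 ~ s(x1, k = 5) + s(x2, k = 5), data = dat)
m2 <- gam(y2 ~ s(x1, k = 5) + s(x2, k = 5), data = dat)

# Approximate significance of smooth terms (edf, F, p-value) for each model
tab1 <- summary(m1)$s.table
tab2 <- summary(m2)$s.table
print(tab1)
print(tab2)

# Plot the estimated terms with their confidence bands, one row per model
op <- par(mfrow = c(2, 2))
plot(m1, select = 1, shade = TRUE, main = "y1: s(x1)")
plot(m1, select = 2, shade = TRUE, main = "y1: s(x2)")
plot(m2, select = 1, shade = TRUE, main = "y2: s(x1)")
plot(m2, select = 2, shade = TRUE, main = "y2: s(x2)")
par(op)

# The same F ratio gives different p-values on different numerator df,
# which is one reason F values are not directly comparable between terms
p_df1 <- pf(10, df1 = 1, df2 = 190, lower.tail = FALSE)
p_df2 <- pf(10, df1 = 2, df2 = 190, lower.tail = FALSE)
print(c(p_df1 = p_df1, p_df2 = p_df2))
```

Because the two responses are on different scales in this sketch, the y-axes of the plotted terms differ between rows; whether to rescale the responses before comparing is a judgment call that depends on the question being asked.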