Mike -
I observe that you have dropped by far the most significant single
predictor in going from the first to the second model. If I had
to guess, I would guess that the remaining predictor variables are
either binary indicator variables or else have only a handful of
distinct values. Can't diagnose much more than that in the absence
of the actual data.
If it were my problem, I would plot the response against each
predictor, also residuals vs. fitted values for each model, and
do some graphical data analysis to diagnose what's going on.
I encourage you to do this for yourself.
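A minimal sketch of those plots, in case it helps. Since I don't have
your data, it runs on simulated stand-in data: the data frame `dat`,
the three-predictor formula, and the model names `full` and `reduced`
are all assumptions for illustration, not your actual setup. Swap in
your own data frame and formulas.

```r
library(mgcv)

# Simulated stand-in data (assumption: the real data were not posted).
set.seed(1)
n <- 60
dat <- data.frame(IGS = runif(n), L2E = runif(n), GED = runif(n))
dat$INT <- 300 + 50 * sin(2 * pi * dat$IGS) + 100 * dat$GED^2 +
  rnorm(n, sd = 20)

# Two fits that differ by one dropped predictor, as in the question.
full    <- gam(INT ~ s(IGS) + s(L2E) + s(GED), data = dat)
reduced <- gam(INT ~ s(IGS) + s(L2E), data = dat)

# How many distinct values does each predictor take?
# A handful of values leaves little for a spline to work with.
predictors <- c("IGS", "L2E", "GED")
n_distinct <- sapply(dat[predictors], function(x) length(unique(x)))
print(n_distinct)

# Response against each predictor.
op <- par(mfrow = c(1, 3))
for (v in predictors) {
  plot(dat[[v]], dat$INT, xlab = v, ylab = "INT")
}
par(op)

# Residuals against fitted values, one panel per model.
op <- par(mfrow = c(1, 2))
plot(fitted(full), residuals(full),
     xlab = "fitted", ylab = "residuals", main = "full")
abline(h = 0, lty = 2)
plot(fitted(reduced), residuals(reduced),
     xlab = "fitted", ylab = "residuals", main = "reduced")
abline(h = 0, lty = 2)
par(op)
```

Structure left over in the residual plot for the reduced model would
suggest that the dropped predictor was carrying real signal.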
- tom blackwell -
On Mon, 6 Jan 2003, Michael F. Palopoli wrote:
> Dear R experts,
>
> I'm hoping someone can help me to interpret the results of building
> gam's with mgcv in R.
>
> Below are summaries of two gam's based on the same dataset. The first
> gam (named "gam.mod") has six predictor variables. The second gam
> (named "gam.mod2") is exactly the same except it is missing one of the
> predictor variables. What is confusing me is the estimated degrees of
> freedom for each of the splines in the second model....
>
> ________________
>
> > summary.gam(mod.gam)
>
> Family: gaussian
> Link function: identity
>
> Formula:
> INT ~ s(IGS) + s(L2E) + s(TED) + s(PSD) + s(OPD) + s(GED)
>
> Parametric coefficients:
>           Estimate  std. err.  t ratio    Pr(>|t|)
> constant    302.32      5.192    58.23  < 2.22e-16
>
> Approximate significance of smooth terms:
>            edf   chi.sq      p-value
> s(IGS)   4.254   58.308   9.5524e-12
> s(L2E)       1   8.7673    0.0030668
> s(TED)       1   8.3915    0.0037697
> s(PSD)       1   6.0234     0.014118
> s(OPD)   2.289   12.745    0.0024349
> s(GED)   3.791   152.68  < 2.22e-16
>
> R-sq.(adj) = 0.885 Deviance explained = 91.1%
> GCV score = 2124.9 Scale est. = 1617.3 n = 60
>
> ________________
>
> > summary.gam(mod.gam2)
>
> Family: gaussian
> Link function: identity
>
> Formula:
> INT ~ s(IGS) + s(L2E) + s(TED) + s(PSD) + s(OPD)
>
> Parametric coefficients:
>           Estimate  std. err.    t ratio    Pr(>|t|)
> constant    302.32  4.736e-14  6.384e+15  < 2.22e-16
>
> Approximate significance of smooth terms:
>               edf      chi.sq     p-value
> s(IGS)  1.757e-05  1.3524e+09  < 2.22e-16
> s(L2E)   0.009991     0.21394      0.6437
> s(TED)  2.945e-05  1.4913e+07  < 2.22e-16
> s(PSD)  2.566e-05  6.5495e+06  < 2.22e-16
> s(OPD)  5.023e-05  3.2332e+07  < 2.22e-16
>
> R-sq.(adj) = 0.645 Deviance explained = 64.5%
> GCV score = 7489.7 Scale est. = 6069.7 n = 60
>
>
> ________________
>
>
> Any suggestions about either (1) what went wrong with the second model?
> or (2) how the heck do I interpret these results?
>
> Thanks,
>
> Mike.
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> http://www.stat.math.ethz.ch/mailman/listinfo/r-help
>