Martijn Wieling
2012-Jun-02 16:25 UTC
[R] mgcv (bam) very large standard error difference between versions 1.7-11 and 1.7-17, bug?
Dear useRs, I reran an analysis with bam (mgcv, version 1.7-17) originally conducted using an older version of bam (mgcv, version 1.7-11) and this resulted in the same estimates, but much lower standard errors (in some cases 20 times as low) and lower p-values. This obviously results in a larger set of significant predictors. Is this result expected given the improvements in the new version? Or this a bug and are the p-values of bam in mgcv 1.7-17 too low? The summaries of both versions are shown below to enable a comparison. In addition, applying the default method="fREML" (mgcv version 1.7-17) on the same dataset yields only non-significant results, while all results are highly significant using method="REML". Furthermore, it also results in large negative (e.g., -8757) edf values linked to s(X,bs="RE") terms. Is this correct, or is this a bug? The summary of the model using method="fREML" is also shown below. I hope someone can shed some light on this. With kind regards, Martijn Wieling, University of Groningen ################################# ### mgcv version 1.7-11 ################################# Family: gaussian Link function: identity Formula: RefPMIdistMeanLog.c ~ s(GeoX, GeoY) + RefVratio.z + IsSemiwordOrDemonstrative + RefSoundCnt.z + SpYearBirth.z * IsAragon + PopCntLog_residGeo.z + s(Word, bs = "re") + s(Key, bs = "re") Parametric coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -0.099757 0.020234 -4.930 8.23e-07 *** RefVratio.z 0.105705 0.013328 7.931 2.19e-15 *** IsSemiwordOrDemonstrative 0.289828 0.046413 6.245 4.27e-10 *** RefSoundCnt.z 0.119981 0.021202 5.659 1.53e-08 *** SpYearBirth.z -0.011396 0.002407 -4.734 2.21e-06 *** IsAragon 0.055678 0.033137 1.680 0.09291 . PopCntLog_residGeo.z -0.006504 0.003279 -1.984 0.04731 * SpYearBirth.z:IsAragon 0.015871 0.005459 2.907 0.00365 ** --- Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 Approximate significance of smooth terms: edf Ref.df F p-value s(GeoX,GeoY) 24.01 24.21 31.16 <2e-16 *** s(Word) 352.29 347.00 501.57 <2e-16 *** s(Key) 269.75 289.25 10.76 <2e-16 *** --- Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 R-sq.(adj) = 0.693 Deviance explained = 69.4% REML score = -22476 Scale est. = 0.038177 n = 112608 ################################# ### mgcv version 1.7-17, much lower p-values and standard errors than version 1.7-11 ################################# Family: gaussian Link function: identity Formula: RefPMIdistMeanLog.c ~ s(GeoX, GeoY) + RefVratio.z + IsSemiwordOrDemonstrative + RefSoundCnt.z + SpYearBirth.z * IsAragon + PopCntLog_residGeo.z + s(Word, bs = "re") + s(Key, bs = "re") Parametric coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -0.0997566 0.0014139 -70.552 < 2e-16 *** RefVratio.z 0.1057049 0.0006565 161.010 < 2e-16 *** IsSemiwordOrDemonstrative 0.2898285 0.0020388 142.155 < 2e-16 *** RefSoundCnt.z 0.1199813 0.0009381 127.905 < 2e-16 *** SpYearBirth.z -0.0113956 0.0006508 -17.509 < 2e-16 *** IsAragon 0.0556777 0.0057143 9.744 < 2e-16 *** PopCntLog_residGeo.z -0.0065037 0.0007938 -8.193 2.58e-16 *** SpYearBirth.z:IsAragon 0.0158712 0.0014829 10.703 < 2e-16 *** --- Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 Approximate significance of smooth terms: edf Ref.df F p-value s(GeoX,GeoY) 24.01 24.21 31.15 <2e-16 *** s(Word) 352.29 347.00 587.26 <2e-16 *** s(Key) 269.75 313.00 4246.76 <2e-16 *** --- Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 R-sq.(adj) = 0.693 Deviance explained = 69.4% REML score = -22476 Scale est. = 0.038177 n = 112608 ################################# ### mgcv version 1.7-17, default: method="fREML", all p-values non-significant and negative edf's of s(X,bs="re") ################################# Family: gaussian Link function: identity Formula: RefPMIdistMeanLog.c ~ s(GeoX, GeoY) + RefVratio.z + IsSemiwordOrDemonstrative + RefSoundCnt.z + SpYearBirth.z * IsAragon + PopCntLog_residGeo.z + s(Word, bs = "re") + s(Key, bs = "re") Parametric coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) -0.099757 1.730235 -0.058 0.954 RefVratio.z 0.105705 1.145329 0.092 0.926 IsSemiwordOrDemonstrative 0.289828 4.167237 0.070 0.945 RefSoundCnt.z 0.119981 1.901158 0.063 0.950 SpYearBirth.z -0.011396 0.034236 -0.333 0.739 IsAragon 0.055678 0.298629 0.186 0.852 PopCntLog_residGeo.z -0.006504 0.041981 -0.155 0.877 SpYearBirth.z:IsAragon 0.015871 0.077142 0.206 0.837 Approximate significance of smooth terms: edf Ref.df F p-value s(GeoX,GeoY) -1376 1 7.823 0.00516 ** s(Word) -8298 347 577.910 < 2e-16 *** s(Key) -1421 316 13.512 < 2e-16 *** --- Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 R-sq.(adj) = 0.741 Deviance explained = 69.4% fREML score = -22476 Scale est. = 0.038177 n = 112608
Simon Wood
2012-Jun-02 22:17 UTC
[R] mgcv (bam) very large standard error difference between versions 1.7-11 and 1.7-17, bug?
the fREML results must be a bug. I'd modified the bam covariance and edf code in 1.7-17 and it looks like I must have messed something up. Any chance you could send me the data off line? Also can you try summary(...,freq=FALSE) under 1.7-17 and see if you get the same results as before? On 06/02/2012 05:25 PM, Martijn Wieling wrote:> Dear useRs, > > I reran an analysis with bam (mgcv, version 1.7-17) originally > conducted using an older version of bam (mgcv, version 1.7-11) and > this resulted in the same estimates, but much lower standard errors > (in some cases 20 times as low) and lower p-values. This obviously > results in a larger set of significant predictors. Is this result > expected given the improvements in the new version? Or this a bug and > are the p-values of bam in mgcv 1.7-17 too low? The summaries of both > versions are shown below to enable a comparison. > > In addition, applying the default method="fREML" (mgcv version 1.7-17) > on the same dataset yields only non-significant results, while all > results are highly significant using method="REML". Furthermore, it > also results in large negative (e.g., -8757) edf values linked to > s(X,bs="RE") terms. Is this correct, or is this a bug? The summary of > the model using method="fREML" is also shown below. > > I hope someone can shed some light on this. > > With kind regards, > Martijn Wieling, > University of Groningen > > ################################# > ### mgcv version 1.7-11 > ################################# > > Family: gaussian > Link function: identity > > Formula: > RefPMIdistMeanLog.c ~ s(GeoX, GeoY) + RefVratio.z + IsSemiwordOrDemonstrative + > RefSoundCnt.z + SpYearBirth.z * IsAragon + PopCntLog_residGeo.z + > s(Word, bs = "re") + s(Key, bs = "re") > > Parametric coefficients: > Estimate Std. Error t value Pr(>|t|) > (Intercept) -0.099757 0.020234 -4.930 8.23e-07 *** > RefVratio.z 0.105705 0.013328 7.931 2.19e-15 *** > IsSemiwordOrDemonstrative 0.289828 0.046413 6.245 4.27e-10 *** > RefSoundCnt.z 0.119981 0.021202 5.659 1.53e-08 *** > SpYearBirth.z -0.011396 0.002407 -4.734 2.21e-06 *** > IsAragon 0.055678 0.033137 1.680 0.09291 . > PopCntLog_residGeo.z -0.006504 0.003279 -1.984 0.04731 * > SpYearBirth.z:IsAragon 0.015871 0.005459 2.907 0.00365 ** > --- > Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 > > Approximate significance of smooth terms: > edf Ref.df F p-value > s(GeoX,GeoY) 24.01 24.21 31.16<2e-16 *** > s(Word) 352.29 347.00 501.57<2e-16 *** > s(Key) 269.75 289.25 10.76<2e-16 *** > --- > Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 > > R-sq.(adj) = 0.693 Deviance explained = 69.4% > REML score = -22476 Scale est. = 0.038177 n = 112608 > > > ################################# > ### mgcv version 1.7-17, much lower p-values and standard errors than > version 1.7-11 > ################################# > > Family: gaussian > Link function: identity > > Formula: > RefPMIdistMeanLog.c ~ s(GeoX, GeoY) + RefVratio.z + IsSemiwordOrDemonstrative + > RefSoundCnt.z + SpYearBirth.z * IsAragon + PopCntLog_residGeo.z + > s(Word, bs = "re") + s(Key, bs = "re") > > Parametric coefficients: > Estimate Std. Error t value Pr(>|t|) > (Intercept) -0.0997566 0.0014139 -70.552< 2e-16 *** > RefVratio.z 0.1057049 0.0006565 161.010< 2e-16 *** > IsSemiwordOrDemonstrative 0.2898285 0.0020388 142.155< 2e-16 *** > RefSoundCnt.z 0.1199813 0.0009381 127.905< 2e-16 *** > SpYearBirth.z -0.0113956 0.0006508 -17.509< 2e-16 *** > IsAragon 0.0556777 0.0057143 9.744< 2e-16 *** > PopCntLog_residGeo.z -0.0065037 0.0007938 -8.193 2.58e-16 *** > SpYearBirth.z:IsAragon 0.0158712 0.0014829 10.703< 2e-16 *** > --- > Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 > > Approximate significance of smooth terms: > edf Ref.df F p-value > s(GeoX,GeoY) 24.01 24.21 31.15<2e-16 *** > s(Word) 352.29 347.00 587.26<2e-16 *** > s(Key) 269.75 313.00 4246.76<2e-16 *** > --- > Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 > > R-sq.(adj) = 0.693 Deviance explained = 69.4% > REML score = -22476 Scale est. = 0.038177 n = 112608 > > > ################################# > ### mgcv version 1.7-17, default: method="fREML", all p-values > non-significant and negative edf's of s(X,bs="re") > ################################# > > Family: gaussian > Link function: identity > > Formula: > RefPMIdistMeanLog.c ~ s(GeoX, GeoY) + RefVratio.z + IsSemiwordOrDemonstrative + > RefSoundCnt.z + SpYearBirth.z * IsAragon + PopCntLog_residGeo.z + > s(Word, bs = "re") + s(Key, bs = "re") > > Parametric coefficients: > Estimate Std. Error t value Pr(>|t|) > (Intercept) -0.099757 1.730235 -0.058 0.954 > RefVratio.z 0.105705 1.145329 0.092 0.926 > IsSemiwordOrDemonstrative 0.289828 4.167237 0.070 0.945 > RefSoundCnt.z 0.119981 1.901158 0.063 0.950 > SpYearBirth.z -0.011396 0.034236 -0.333 0.739 > IsAragon 0.055678 0.298629 0.186 0.852 > PopCntLog_residGeo.z -0.006504 0.041981 -0.155 0.877 > SpYearBirth.z:IsAragon 0.015871 0.077142 0.206 0.837 > > Approximate significance of smooth terms: > edf Ref.df F p-value > s(GeoX,GeoY) -1376 1 7.823 0.00516 ** > s(Word) -8298 347 577.910< 2e-16 *** > s(Key) -1421 316 13.512< 2e-16 *** > --- > Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 > > R-sq.(adj) = 0.741 Deviance explained = 69.4% > fREML score = -22476 Scale est. = 0.038177 n = 112608 > > ______________________________________________ > R-help at r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. >
Martijn Wieling
2012-Jun-03 17:45 UTC
[R] mgcv (bam) very large standard error difference between versions 1.7-11 and 1.7-17, bug?
Dear useRs, I've ran some additional analyses (see below), which strongly suggest the standard errors of the bam (and gam) function are much too low in mgcv version 1.7-17, at least when including an s(X,bs="re") term. Until this issue has been clarified, it's perhaps best to use an older version of mgcv (unfortunately, however, in earlier versions the p-value calculation of s(X,bs="re") is not correct). All analyses were conducted in R 2.15.0. My approach was the following: I created a mixed-effects regression model with a single random intercept and only linear predictors. In my view, the results using lmer (lme4) should be comparable to those of bam and gam (mgcv). This was the case when using an older version of mgcv (version 1.7-13), but this is not the case anymore in version 1.7-17. In version 1.7-17, the standard errors and p-values are much lower and very similar to those of a linear model (which does not take the random-effects structure into account). The R-code and results are shown below. (The results using gam are not shown, but show the same pattern.) Furthermore, note that the differences in standard errors become less severe (but still noticeable) when less data is involved (e.g., using only 500 rows as opposed to >100.000). Finally, when not including an s(X,bs="re") term, but another non-random-effect smooth, the standard errors do not appear to be structurally lower (only for some variables, but not by a great deal - see also below). With kind regards, Martijn Wieling University of Groningen #### lme4 model (most recent version of lme4) modelLMER <- lmer(RefPMIdistMeanLog.c ~ SpYearBirth.z*IsAragon + SpIsMale + (1|Key), data=wrddst) # Estimate Std. Error t value #SpYearBirth.z -0.012084 0.004577 -2.640 #IsAragon 0.138959 0.010040 13.840 #SpIsMale -0.003087 0.008290 -0.372 #SpYearBirth.z:IsAragon 0.015429 0.010159 1.519 #### mgcv 1.7-13, default (method = "REML") - almost identical to modelLMER modelBAMold <- bam(RefPMIdistMeanLog.c ~ SpYearBirth.z*IsAragon + SpIsMale + s(Key,bs="re"), data=wrddst) # Estimate Std. Error t value Pr(>|t|) #SpYearBirth.z -0.012084 0.004578 -2.640 0.00829 ** #IsAragon 0.138959 0.010042 13.838 < 2e-16 *** #SpIsMale -0.003087 0.008292 -0.372 0.70968 #SpYearBirth.z:IsAragon 0.015429 0.010160 1.519 0.12886 #### mgcv 1.7-17, method = "REML" - standard errors greatly reduced # (comparable to standard errors of LM without random intercept) modelBAMnew <- bam(RefPMIdistMeanLog.c ~ SpYearBirth.z*IsAragon + SpIsMale + s(Key,bs="re"), data=wrddst); print(testje,cor=F) # Estimate Std. Error t value Pr(>|t|) #SpYearBirth.z -0.012084 0.001159 -10.428 < 2e-16 *** #IsAragon 0.138959 0.002551 54.472 < 2e-16 *** #SpIsMale -0.003087 0.002098 -1.471 0.141 #SpYearBirth.z:IsAragon 0.015429 0.002587 5.965 2.45e-09 *** #### lm results, standard errors comparable to mgcv 1.7-17 modelLM <- lm(RefPMIdistMeanLog.c ~ SpYearBirth.z*IsAragon + SpIsMale, data=wrddst) # Estimate Std. Error t value Pr(>|t|) #(Intercept) -0.025779 0.001653 -15.595 < 2e-16 *** #SpYearBirth.z -0.011906 0.001182 -10.070 < 2e-16 *** #IsAragon 0.139323 0.002603 53.531 < 2e-16 *** #SpIsMale -0.003076 0.002140 -1.437 0.151 #SpYearBirth.z:IsAragon 0.015252 0.002639 5.780 7.49e-09 *** #### mgcv 1.7-17, default (method = "fREML") - completely different from previous models modelBAMfREML <- bam(RefPMIdistMeanLog.c ~ SpYearBirth.z*IsAragon + SpIsMale + s(Key,bs="re"), data=wrddst); print(testje,cor=F) # Estimate Std. Error t value Pr(>|t|) #(Intercept) -0.025391 0.106897 -0.238 0.812 #SpYearBirth.z -0.012084 0.076300 -0.158 0.874 #IsAragon 0.138959 0.166697 0.834 0.405 #SpIsMale -0.003087 0.138291 -0.022 0.982 #SpYearBirth.z:IsAragon 0.015429 0.168260 0.092 0.927 # #Approximate significance of smooth terms: # edf Ref.df F p-value #s(Key) -38.95 310 15.67 <2e-16 *** #### differences w.r.t. standard smooths #### mgcv version 1.7-13 m2old <- bam(RefPMIdistMeanLog.c ~ s(GeoX,GeoY) + SpYearBirth.z*IsAragon + SpIsMale, data=wrddst, method="REML") ## RESULTS #Family: gaussian #Link function: identity # #Formula: #RefPMIdistMeanLog.c ~ s(GeoX, GeoY) + SpYearBirth.z * IsAragon + # SpIsMale # #Parametric coefficients: # Estimate Std. Error t value Pr(>|t|) #(Intercept) -0.001386 0.004982 -0.278 0.7809 #SpYearBirth.z -0.012950 0.001167 -11.097 < 2e-16 *** #IsAragon 0.020532 0.023608 0.870 0.3845 #SpIsMale -0.004788 0.002219 -2.158 0.0309 * #SpYearBirth.z:IsAragon 0.015611 0.002600 6.005 1.92e-09 *** #--- #Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 # #Approximate significance of smooth terms: # edf Ref.df F p-value #s(GeoX,GeoY) 27.11 28.14 126.2 <2e-16 *** #--- #Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 # #R-sq.(adj) = 0.0555 Deviance explained = 5.58% #REML score = 39232 Scale est. = 0.11734 n = 112608 #### mgcv version 1.7-17 m2new <- bam(RefPMIdistMeanLog.c ~ s(GeoX,GeoY) + SpYearBirth.z*IsAragon + SpIsMale, data=wrddst, method="REML") #Family: gaussian #Link function: identity # #Formula: #RefPMIdistMeanLog.c ~ s(GeoX, GeoY) + SpYearBirth.z * IsAragon + # SpIsMale # #Parametric coefficients: # Estimate Std. Error t value Pr(>|t|) #(Intercept) -0.001388 0.003938 -0.352 0.7245 #SpYearBirth.z -0.012950 0.001167 -11.098 < 2e-16 *** #IsAragon 0.020543 0.018055 1.138 0.2552 #SpIsMale -0.004788 0.002215 -2.161 0.0307 * #SpYearBirth.z:IsAragon 0.015611 0.002600 6.005 1.92e-09 *** #--- #Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 # #Approximate significance of smooth terms: # edf Ref.df F p-value #s(GeoX,GeoY) 27.11 28.14 126.2 <2e-16 *** #--- #Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 # #R-sq.(adj) = 0.0555 Deviance explained = 5.58% #REML score = 39232 Scale est. = 0.11734 n = 112608 On Sat, Jun 2, 2012 at 6:25 PM, Martijn Wieling <wieling at gmail.com> wrote:> Dear useRs, > > I reran an analysis with bam (mgcv, version 1.7-17) originally > conducted using an older version of bam (mgcv, version 1.7-11) and > this resulted in the same estimates, but much lower standard errors > (in some cases 20 times as low) and lower p-values. This obviously > results in a larger set of significant predictors. Is this result > expected given the improvements in the new version? Or this a bug and > are the p-values of bam in mgcv 1.7-17 too low? The summaries of both > versions are shown below to enable a comparison. > > In addition, applying the default method="fREML" (mgcv version 1.7-17) > on the same dataset yields only non-significant results, while all > results are highly significant using method="REML". Furthermore, it > also results in large negative (e.g., -8757) edf values linked to > s(X,bs="RE") terms. Is this correct, or is this a bug? The summary of > the model using method="fREML" is also shown below. > > I hope someone can shed some light on this. > > With kind regards, > Martijn Wieling, > University of Groningen > > ################################# > ### mgcv version 1.7-11 > ################################# > > Family: gaussian > Link function: identity > > Formula: > RefPMIdistMeanLog.c ~ s(GeoX, GeoY) + RefVratio.z + IsSemiwordOrDemonstrative + > ? ?RefSoundCnt.z + SpYearBirth.z * IsAragon + PopCntLog_residGeo.z + > ? ?s(Word, bs = "re") + s(Key, bs = "re") > > Parametric coefficients: > ? ? ? ? ? ? ? ? ? ? ? ? ? Estimate Std. Error t value Pr(>|t|) > (Intercept) ? ? ? ? ? ? ? -0.099757 ? 0.020234 ?-4.930 8.23e-07 *** > RefVratio.z ? ? ? ? ? ? ? ?0.105705 ? 0.013328 ? 7.931 2.19e-15 *** > IsSemiwordOrDemonstrative ?0.289828 ? 0.046413 ? 6.245 4.27e-10 *** > RefSoundCnt.z ? ? ? ? ? ? ?0.119981 ? 0.021202 ? 5.659 1.53e-08 *** > SpYearBirth.z ? ? ? ? ? ? -0.011396 ? 0.002407 ?-4.734 2.21e-06 *** > IsAragon ? ? ? ? ? ? ? ? ? 0.055678 ? 0.033137 ? 1.680 ?0.09291 . > PopCntLog_residGeo.z ? ? ?-0.006504 ? 0.003279 ?-1.984 ?0.04731 * > SpYearBirth.z:IsAragon ? ? 0.015871 ? 0.005459 ? 2.907 ?0.00365 ** > --- > Signif. codes: ?0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 > > Approximate significance of smooth terms: > ? ? ? ? ? ? ? ?edf Ref.df ? ? ?F p-value > s(GeoX,GeoY) ?24.01 ?24.21 ?31.16 ?<2e-16 *** > s(Word) ? ? ?352.29 347.00 501.57 ?<2e-16 *** > s(Key) ? ? ? 269.75 289.25 ?10.76 ?<2e-16 *** > --- > Signif. codes: ?0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 > > R-sq.(adj) = ?0.693 ? Deviance explained = 69.4% > REML score = -22476 ?Scale est. = 0.038177 ?n = 112608 > > > ################################# > ### mgcv version 1.7-17, much lower p-values and standard errors than > version 1.7-11 > ################################# > > Family: gaussian > Link function: identity > > Formula: > RefPMIdistMeanLog.c ~ s(GeoX, GeoY) + RefVratio.z + IsSemiwordOrDemonstrative + > ? ?RefSoundCnt.z + SpYearBirth.z * IsAragon + PopCntLog_residGeo.z + > ? ?s(Word, bs = "re") + s(Key, bs = "re") > > Parametric coefficients: > ? ? ? ? ? ? ? ? ? ? ? ? ? ?Estimate Std. Error t value Pr(>|t|) > (Intercept) ? ? ? ? ? ? ? -0.0997566 ?0.0014139 -70.552 ?< 2e-16 *** > RefVratio.z ? ? ? ? ? ? ? ?0.1057049 ?0.0006565 161.010 ?< 2e-16 *** > IsSemiwordOrDemonstrative ?0.2898285 ?0.0020388 142.155 ?< 2e-16 *** > RefSoundCnt.z ? ? ? ? ? ? ?0.1199813 ?0.0009381 127.905 ?< 2e-16 *** > SpYearBirth.z ? ? ? ? ? ? -0.0113956 ?0.0006508 -17.509 ?< 2e-16 *** > IsAragon ? ? ? ? ? ? ? ? ? 0.0556777 ?0.0057143 ? 9.744 ?< 2e-16 *** > PopCntLog_residGeo.z ? ? ?-0.0065037 ?0.0007938 ?-8.193 2.58e-16 *** > SpYearBirth.z:IsAragon ? ? 0.0158712 ?0.0014829 ?10.703 ?< 2e-16 *** > --- > Signif. codes: ?0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 > > Approximate significance of smooth terms: > ? ? ? ? ? ? ? ?edf Ref.df ? ? ? F p-value > s(GeoX,GeoY) ?24.01 ?24.21 ? 31.15 ?<2e-16 *** > s(Word) ? ? ?352.29 347.00 ?587.26 ?<2e-16 *** > s(Key) ? ? ? 269.75 313.00 4246.76 ?<2e-16 *** > --- > Signif. codes: ?0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 > > R-sq.(adj) = ?0.693 ? Deviance explained = 69.4% > REML score = -22476 ?Scale est. = 0.038177 ?n = 112608 > > > ################################# > ### mgcv version 1.7-17, default: method="fREML", all p-values > non-significant and negative edf's of s(X,bs="re") > ################################# > > Family: gaussian > Link function: identity > > Formula: > RefPMIdistMeanLog.c ~ s(GeoX, GeoY) + RefVratio.z + IsSemiwordOrDemonstrative + > ? ?RefSoundCnt.z + SpYearBirth.z * IsAragon + PopCntLog_residGeo.z + > ? ?s(Word, bs = "re") + s(Key, bs = "re") > > Parametric coefficients: > ? ? ? ? ? ? ? ? ? ? ? ? ? Estimate Std. Error t value Pr(>|t|) > (Intercept) ? ? ? ? ? ? ? -0.099757 ? 1.730235 ?-0.058 ? ?0.954 > RefVratio.z ? ? ? ? ? ? ? ?0.105705 ? 1.145329 ? 0.092 ? ?0.926 > IsSemiwordOrDemonstrative ?0.289828 ? 4.167237 ? 0.070 ? ?0.945 > RefSoundCnt.z ? ? ? ? ? ? ?0.119981 ? 1.901158 ? 0.063 ? ?0.950 > SpYearBirth.z ? ? ? ? ? ? -0.011396 ? 0.034236 ?-0.333 ? ?0.739 > IsAragon ? ? ? ? ? ? ? ? ? 0.055678 ? 0.298629 ? 0.186 ? ?0.852 > PopCntLog_residGeo.z ? ? ?-0.006504 ? 0.041981 ?-0.155 ? ?0.877 > SpYearBirth.z:IsAragon ? ? 0.015871 ? 0.077142 ? 0.206 ? ?0.837 > > Approximate significance of smooth terms: > ? ? ? ? ? ? ? edf Ref.df ? ? ? F p-value > s(GeoX,GeoY) -1376 ? ? ?1 ? 7.823 0.00516 ** > s(Word) ? ? ?-8298 ? ?347 577.910 < 2e-16 *** > s(Key) ? ? ? -1421 ? ?316 ?13.512 < 2e-16 *** > --- > Signif. codes: ?0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 > > R-sq.(adj) = ?0.741 ? Deviance explained = 69.4% > fREML score = -22476 ?Scale est. = 0.038177 ?n = 112608