Hi folks,

I originally tried R-SIG-Mixed-Models for this one
(https://stat.ethz.ch/pipermail/r-sig-mixed-models/2010q3/004170.html),
but I think that the final steps to a solution aren't mixed-model
specific, so I thought I'd ask my final questions here.

I used gamm4 to fit a generalized additive mixed model to data from an
AxBxC design, where A is a random effect (human participants in an
experiment), B is a 2-level factor predictor variable, and C is a
continuous variable whose effect is likely non-linear. I tell gamm4 to
fit a smooth across C within each level of B independently, and I can
use predict.gam(..., se.fit = TRUE) to obtain predictions from the
fitted model as well as the standard error of each prediction. I'd
like to visualize the BxC interaction to see whether smoothing C
within each level of B was really necessary, and if so, where along
the C dimension it is that B affects the smooth. It's easy enough to
obtain the predicted B1 - B2 difference function, but I'm stuck on how
to convey the uncertainty of this function (e.g. computing the
confidence interval of the difference at each value of C).

One thought is that predict.gam(..., se.fit = TRUE) returns SE values,
so if I could find out the N on which these SEs are computed, I could
compute the difference CI as

sqrt( ( SD_B1^2 + SD_B2^2 ) / N ) * qt( .975, df = N - 1 )

However, I can't seem to figure out what value of N was used to
compute the SEs that predict.gam(..., se.fit = TRUE) produces. Can
anyone point me to where I might find N? Further, is N - 1 the proper
df for the call to qt()?

Finally, with a smooth function and 95% confidence intervals computed
at each of a large number of points, don't I run into a problem of an
inflated Type I error rate? Or does the fact that each point is not
independent of those next to it make this an inappropriate concern?

Cheers,

Mike

--
Mike Lawrence
Graduate Student
Department of Psychology, Dalhousie University

Looking to arrange a meeting? Check my public calendar:
http://tr.im/mikes_public_calendar

~ Certainty is folly... I think. ~
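For concreteness, a minimal sketch of the kind of gamm4 fit being
described; the data frame `dat` and the variables y, A, B and C are
hypothetical stand-ins for the actual data:

## minimal sketch of the fit described above; `dat`, y, A, B and C are
## hypothetical stand-ins for the real data
library(gamm4)
m <- gamm4(y ~ B + s(C, by = B),  # separate smooth of C per level of B
           random = ~ (1 | A),    # participants as random intercepts
           data = dat)

## pointwise predictions and standard errors on a grid of C values
newd <- expand.grid(C = seq(min(dat$C), max(dat$C), length = 100),
                    B = factor(levels(dat$B), levels = levels(dat$B)))
p <- predict(m$gam, newd, se.fit = TRUE)

(For what it's worth, the se.fit values don't come from a sample of
size N at all: they are derived from the posterior covariance matrix
of the model coefficients, as the lpmatrix code later in this thread
makes explicit.)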
On Thu, 2010-08-05 at 13:49 -0300, Mike Lawrence wrote:
> Hi folks,

I've been pondering this for a little while, drumming up the courage
to reply, whilst I've been away on fieldwork with patchy email
access... so here goes with the neck-sticking-out thing... ;-)

> I originally tried R-SIG-Mixed-Models for this one
> (https://stat.ethz.ch/pipermail/r-sig-mixed-models/2010q3/004170.html),
> but I think that the final steps to a solution aren't mixed-model
> specific, so I thought I'd ask my final questions here.

Two things spring to mind:

1) I don't know enough to know if these models are nested, but there
should be a way to make them nested... Anyway, why not fit a model
with a 'by' term and a model without the 'by' term, and compare them
using a likelihood ratio test? Something like:

m1 <- gamm(y ~ B + s(C, by = B), data = foo,
           random = list(A = ~ 1), method = "ML", ...)
m2 <- gamm(y ~ s(C), data = foo,
           random = list(A = ~ 1), method = "ML", ...)
anova(m1$lme, m2$lme)

With s() smooths these won't be truly nested. You could also use te()
smooths, but I don't know the specifics of how you can ensure these
models are strictly nested. Simon Wood may be able to help you if you
go down this route.

2) Do the whole thing in gam(), which can also deal with simple random
effects. You'll need to look at ?gam.models to see how to set these
things up, but at least you won't have to deal with potential problems
of fitting these things via nlme::lme. You'd still need to work out
the nesting issue, and then fit the models with the same formulas as
above, but without the random bit. (A sketch of this follows after
this message.)

> [...]
>
> Finally, with a smooth function and 95% confidence intervals computed
> at each of a large number of points, don't I run into a problem of an
> inflated Type I error rate? Or does the fact that each point is not
> independent from those next to it make this an inappropriate concern?

IIRC, yes. These will be pointwise confidence intervals and won't hold
across the whole fitted smooth at the 95% level. Simon's book
demonstrates two methods that could be used to generate simultaneous
confidence intervals via simulation from the fitted model.
The first involves simulating from a multivariate normal distribution
with mean given by the coefficients of the model and covariance given
by their estimated covariance matrix. This will yield confidence
intervals that are conditional upon the smoothing parameters (i.e.
they will be a bit narrow, as they don't take into account the fact
that the smoothing parameters were estimated rather than known a
priori). To get round this, a parametric bootstrap can be used to
simulate from the model; an example for a Poisson model is given in
Simon's book for the mackerel egg data.

In the back of my mind I recall reading something that niggles me with
regard to your approach and how you would interpret these confidence
intervals (perhaps this applies just to the pointwise ones...?). I am
not 100% sure that you can do the sort of comparison you want by
evaluating whether the curves are different by virtue of falling
outside one another's confidence intervals...?

Anyway, I think the LRT approach with 'by' smooths might be a better
way to go. Apologies if you've already considered this, by the way;
I've not followed all the threads on R-SIG-Mixed-Models.

HTH

G

--
Dr. Gavin Simpson            [t] +44 (0)20 7679 0522
ECRC, UCL Geography,         [f] +44 (0)20 7679 0565
Pearson Building,            [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London         [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT.                [w] http://www.freshwaters.org.uk
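Regarding Gavin's option (2), a minimal sketch, assuming a version of
mgcv recent enough to represent simple random effects as random-effect
smooths via s(..., bs = "re"); foo, y, A, B and C are hypothetical as
before:

## minimal sketch of option (2): everything in mgcv::gam(), with the
## participant factor A entering as a random-effect smooth; assumes an
## mgcv version supporting s(..., bs = "re") and that A is a factor
library(mgcv)
m1 <- gam(y ~ B + s(C, by = B) + s(A, bs = "re"),
          data = foo, method = "ML")
m2 <- gam(y ~ s(C) + s(A, bs = "re"),
          data = foo, method = "ML")
anova(m1, m2, test = "Chisq")  # approximate test; nesting caveats still apply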
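And a minimal sketch of the first of the two simulation methods Gavin
mentions: draw coefficient vectors from their approximate multivariate
normal posterior, then widen the band until 95% of entire simulated
curves lie within it. A fitted mgcv::gam object `b` and a prediction
data frame `newd` are assumed:

## minimal sketch of a simultaneous interval via simulation from the
## approximate posterior of the coefficients; assumes a fitted
## mgcv::gam model `b` and a prediction data frame `newd`
library(MASS)                             # for mvrnorm()
Xp  <- predict(b, newd, type = "lpmatrix")
fit <- c(Xp %*% coef(b))                  # fitted curve
se  <- rowSums((Xp %*% vcov(b)) * Xp)^0.5 # pointwise s.e.
br  <- mvrnorm(10000, coef(b), vcov(b))   # posterior draws of the coefs
sims <- Xp %*% t(br)                      # one simulated curve per column
## maximum standardized deviation of each simulated curve from the fit
z <- apply(abs(sims - fit) / se, 2, max)
crit <- quantile(z, 0.95)                 # simultaneous critical value
upper <- fit + crit * se
lower <- fit - crit * se

As noted above, this is conditional on the estimated smoothing
parameters; the parametric bootstrap variant would instead simulate
new response vectors from the fitted model and refit, so that
smoothing-parameter uncertainty is also reflected.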
You can do this using predict.gam(..., type = "lpmatrix"). Here's some
code providing an example...

## example of smooths conditioned on factors, from ?gam.models
library(mgcv)
dat <- gamSim(4)

## fit model...
b <- gam(y ~ fac + s(x2, by = fac) + s(x0), data = dat)
plot(b, pages = 1)

## now predict on an x2 grid at 2 levels of `fac'...
x2 <- seq(0, 1, length = 100); x3 <- c(x2, x2)
newd <- data.frame(x2 = x3, x0 = rep(.5, 200),
                   fac = c(rep("1", 100), rep("2", 100)))
Xp <- predict(b, newd, type = "lpmatrix")

## rows 1:100 of Xp are for fac == "1", rows 101:200 for fac == "2";
## take the row-wise difference, then zero the columns belonging to
## the parametric terms (1:3) and to the smooths not involved in the
## comparison (22:39), leaving only the columns of the two s(x2)
## smooths being compared
X <- Xp[1:100, ] - Xp[101:200, ]
X[, 1:3] <- 0; X[, 22:39] <- 0

## now X %*% coef(b) gives the difference between the smooths for
## fac == "1" and fac == "2"...
md <- X %*% coef(b)

## the following evaluates diag(X %*% vcov(b) %*% t(X))^0.5, the s.e.
## of the difference just computed (note the square root)...
se <- rowSums((X %*% vcov(b)) * X)^0.5

## so getting upper and lower (pointwise) confidence limits is easy
## (could use qt(.975, b$df.residual) in place of 2 below)
ul <- md + 2 * se; ll <- md - 2 * se

## plot the two smooths and the difference
par(mfrow = c(2, 2)); plot(b, select = 1); plot(b, select = 2)
plot(x2, md, ylim = range(c(ul, ll)), type = "l")
lines(x2, ul, col = 2); lines(x2, ll, col = 2)

On Thursday 05 August 2010 17:49, Mike Lawrence wrote:
> [original message quoted above; snipped]

--
Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
+44 1225 386603  www.maths.bath.ac.uk/~sw283