Anna Hargreaves
2014-Jan-25 15:29 UTC
[R] gam(mgcv) predicting across random effect and understanding narrow CI
Hi everyone, I am new to additive modelling and am surprised by the results of a model I'm working on. I wanted to check with more experienced users to make sure I'm not misunderstanding something basic. *Data:* I have 10 replicated runs from an evolutionary simulation model, measuring the evolved change in dispersal after an event (climate change) across a spatial environmental gradient. I want to find a fitted line showing mean change in dispersal across the gradient/model landscape with 95% confidence intervals, as a way of assessing in which parts of the gradient dispersal changed significantly (ie if 95% CI don't overlap 0). My data is: response = change in dispersal ('Ddif') predictor = position along spatial gradient 401 columns long ('col') random effect = replicate ('rep') The data are here: https://www.dropbox.com/sh/3xfjusgkit9bvt2/Ts2KjMOa7Y (There are NAs at the first and last columns of each run as simulation organisms died out at the extremes of the environmental gradient) *Question 1:* Because adding the random effect is drastically reducing my perceived sample size (from ~4000 data points to only 10 independent model runs), I expected a large increase in the width of the CI. However, there is almost no difference in the line or CI produced by the models: gam.nore <- gam(Ddif ~ s(col), data=trapzdc0, method="REML") gam.re <- gam(Ddif ~ s(col) + s(rep, bs="re"), data=trapzdc0, method="REML") There is also very little change in the R^2 and no change in n=3590 reported in the summary. I tried with gamm as well but still no noticeable change (I know that in this case I'm adding a random intercept instead of a random smoother but figured I'd try everything): gamm.re <- gamm(Ddif ~ s(col), random = list(rep =~ 1), data = trapzdc0) Why do the CI change so little? Is a random effect not the right way to account for non-independence of data in GAMs? Have I coded something incorrectly? *Question 2:* When I ask for a predicted line across replicates (ie summarizing the general patterns of the model), I get predicted lines that are 3590 long (ie across all data points), rather than 401 (the length of the predictor variable). This also makes me think I somehow haven't communicated the random effect properly to R. pgam.re <- as.data.frame(predict(gam.re, type='link', level=0, se.fit TRUE)) dim(pgam.re) #[1] 3590 2 I tried adding 'level=0', which asks for a global prediction across random effects in lmer, but no change. pgam.re <- as.data.frame(predict(gam.re, type='link', level=0, se.fit TRUE)) Even if I make a new data frame of the right length (401) across which the model should predict and a dummy 'rep' variable I get a predictor 3590 long: col <- c(seq(1,401,1)) rep <- c(rep(0,401)) newdat <- as.data.frame(cbind(col,rep)); dim(newdat) pgam.re2 <- as.data.frame(predict(gam.re, type='link', level=0, se.fit TRUE, data=newdat)) dim(pgam.re2) [1] 3590 2 How can I get a prediction (for use in plotting, eg ggplot) across replicates? I've spent days trolling for answers in help sites to no avail - any insight would be greatly appreciated! Thanks!! -- View this message in context: http://r.789695.n4.nabble.com/gam-mgcv-predicting-across-random-effect-and-understanding-narrow-CI-tp4684135.html Sent from the R help mailing list archive at Nabble.com.