Julien Riou
2014-May-20 10:05 UTC
[R] Variance of predictions from lme4 models at existing levels?
Dear contributors, I know that this has been widely discussed, but even after having read many discussions on this matter, I'm still not sure if I'm understanding properly. So I have a dataset of studies reporting prevalence in several settings, here is an exemple: data<-data.frame(id_study=c("UK1","UK2","UK3","FRA1","FRA2","BEL1","GER1","GER2","GER3"), country=c("UK","UK","UK","FRANCE","FRANCE","BELGIUM","GERMANY","GERMANY","GERMANY"), n_events=c(130,49,18,10,50,6,34,20,51), n_total=c(3041,580,252,480,887,256,400,206,300), study_median_age=c(25,50,58,30,42,26,27,28,36), pop_type=c(2,0,1,0,0,0,1,0,2)) data$prevalence<-data$n_events/data$n_total Since the prevalence is linked to the sample age and type, the objective is to estimate the predicted prevalence for each study for a sample with standardized median age (30) and type ("A"). Then these predictions will be pooled by country using a Dersimonian & Laird method. This is the model I chose: mod<-glmer(cbind(n_events, n_total-n_events) ~ study_median_age + pop_type + (1|country) + (1|id_study), data=data, family="binomial"(link=logit)) I tried to adapt the method from the r-sig-mixed-models FAQ<http://glmm.wikidot.com/faq> in order to get the variance of the estimate, which will be used in the weighting of the Dersimonian & Laird meta-analysis: newdat=data.frame(id_study=data$id_study, country=data$country, study_median_age=30, pop_type="A", n_events=0, n_total=data$n_total) pred<-predict(mod,newdat[,1:4],type="link",REform=~(1|country) + (1|id_study)) mm <- model.matrix(terms(mod),newdat) newdat$distance <- mm %*% fixef(mod) pvar1 <- diag(mm %*% tcrossprod(vcov(mod),mm)) tvar1 <- pvar1 + VarCorr(mod)$country[1] + VarCorr(mod)$id_study[1] If I understood correctly, tvar1 is an estimation of the variance of the prediction, taking into account the uncertainty on the two random intercepts. I can consider tvar1 as the variance of the prediction for the further meta-analysis, and also for calculation of the confidence interval of the prediction. data$prediction<-mod@resp$family$linkinv(pred) data$predictionSE2<-tvar1 data$predictionCI<-paste0("[",round(mod@resp$family$linkinv(pred-1.96*tvar1),4),";", round(mod@resp$family$linkinv(pred+1.96*tvar1),4),"]") data So my questions are: - Did I properly adapt the method for the calculation of the variance to this model with 2 random intercepts? - Can I use this variance in the meta-analysis? Thank you! [[alternative HTML version deleted]]