Julien Riou
2014-May-20 10:05 UTC
[R] Variance of predictions from lme4 models at existing levels?
Dear contributors,
I know that this has been widely discussed, but even after having read many
discussions on this matter, I'm still not sure if I'm understanding
properly.
So I have a dataset of studies reporting prevalence in several settings,
here is an exemple:
data<-data.frame(id_study=c("UK1","UK2","UK3","FRA1","FRA2","BEL1","GER1","GER2","GER3"),
country=c("UK","UK","UK","FRANCE","FRANCE","BELGIUM","GERMANY","GERMANY","GERMANY"),
n_events=c(130,49,18,10,50,6,34,20,51),
n_total=c(3041,580,252,480,887,256,400,206,300),
study_median_age=c(25,50,58,30,42,26,27,28,36),
pop_type=c(2,0,1,0,0,0,1,0,2))
data$prevalence<-data$n_events/data$n_total
Since the prevalence is linked to the sample age and type, the objective is
to estimate the predicted prevalence for each study for a sample with
standardized median age (30) and type ("A"). Then these predictions
will be
pooled by country using a Dersimonian & Laird method.
This is the model I chose:
mod<-glmer(cbind(n_events, n_total-n_events) ~ study_median_age + pop_type +
(1|country) + (1|id_study), data=data,
family="binomial"(link=logit))
I tried to adapt the method from the r-sig-mixed-models
FAQ<http://glmm.wikidot.com/faq> in
order to get the variance of the estimate, which will be used in the
weighting of the Dersimonian & Laird meta-analysis:
newdat=data.frame(id_study=data$id_study,
country=data$country,
study_median_age=30,
pop_type="A",
n_events=0,
n_total=data$n_total)
pred<-predict(mod,newdat[,1:4],type="link",REform=~(1|country) +
(1|id_study))
mm <- model.matrix(terms(mod),newdat)
newdat$distance <- mm %*% fixef(mod)
pvar1 <- diag(mm %*% tcrossprod(vcov(mod),mm))
tvar1 <- pvar1 + VarCorr(mod)$country[1] + VarCorr(mod)$id_study[1]
If I understood correctly, tvar1 is an estimation of the variance of the
prediction, taking into account the uncertainty on the two random
intercepts. I can consider tvar1 as the variance of the prediction for the
further meta-analysis, and also for calculation of the confidence interval
of the prediction.
data$prediction<-mod@resp$family$linkinv(pred)
data$predictionSE2<-tvar1
data$predictionCI<-paste0("[",round(mod@resp$family$linkinv(pred-1.96*tvar1),4),";",
round(mod@resp$family$linkinv(pred+1.96*tvar1),4),"]")
data
So my questions are:
- Did I properly adapt the method for the calculation of the variance to
this model with 2 random intercepts?
- Can I use this variance in the meta-analysis?
Thank you!
[[alternative HTML version deleted]]