Dear users of metafor,

I am working on a meta-analysis using the metafor package. I have an Excel CSV database that I am working with. I am interested in pooling the effect measures for a particular subgroup (European women) in this database. I am conducting both a subgroup analysis and a meta-regression.

In the subgroup analysis, I stratified the database to create a separate CSV file just for European women from the original database and ran the following:

women_west<-read.csv("women_west.csv")
print(women_west)
dat<-escalc(measure="ZCOR",ri=Pearson,ni=N,data=women_west,append=TRUE)
res<-rma(yi,vi,data=dat)
is.factor(dat$year)
forest(res,transf=transf.ztor)

In the meta-regression, I used the original database, but used categorical moderators for sex (=women) and ethnicity (=european) to find the effect specifically in European women.

adult<-read.csv("adult.csv")
print(adult)
dat<-escalc(measure="ZCOR",ri=Pearson,ni=N,data=adult,append=TRUE)
res<-rma(yi,vi,data=dat)
res<-rma(yi,vi,mods=cbind(sex,race),data=dat)
predict(res,transf=transf.ztor,newmods=cbind(seq(from=0,to=1,by=1),1),addx=TRUE)

I am getting different results from the forest function in the subgroup analysis and the predict function in the meta-regression. I thought they should have been the same. Can I get help to explain why there are differences? In both cases, I am transforming raw Pearson coefficients to z-transformed coefficients, then back-transforming to raw r after pooling.

Thank you very much.

Jin Choi
MSc (Epidemiology) Student
McGill University, Montreal CANADA
At 18:03 05/05/2012, Jin Choi wrote:
>Dear users of metafor,
>
>I am working on a meta-analysis using the metafor package. I have a
>excel csv database that I am working with. I am interested in pooling
>the effect measures for a particular subgroup (European women) in this
>csv database. I am conducting both sub-group and meta-regression.
>
>In subgroup-analyses, I have stratified the database to create a
>separate csv file just for European women from the original database
>and conducted the following:

Dear Jin

There is a third option: use the original dataset together with the subset argument of rma() in metafor. What happens if you do that? It would rule out any possibility that your women_west dataset is not in fact the same as the data on European women in the adult dataset.

Michael Dewey
info at aghmed.fsnet.co.uk
http://www.aghmed.fsnet.co.uk/home.html
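The check Michael describes can be sketched in base R before any model is fitted. The data frame below is made up for illustration, and the column names and codings (sex, race, "F", "european") are assumptions, since the thread does not show how adult.csv is coded. The same logical condition could then be passed to rma() through its subset argument.

```r
# Hypothetical stand-in for the full data set; column names and coding are assumed
adult <- data.frame(
  study   = 1:6,
  sex     = c("F", "M", "F", "F", "M", "F"),
  race    = c("european", "european", "asian", "european", "asian", "european"),
  Pearson = c(0.21, 0.35, 0.10, 0.42, 0.28, 0.15),
  N       = c(120, 85, 60, 200, 95, 150)
)

# Stand-in for the separately prepared file (in practice: read.csv("women_west.csv"))
women_west <- adult[c(1, 4, 6), ]

# Rows that a subset= condition would select from the full data
sel <- adult$sex == "F" & adult$race == "european"
from_adult <- adult[sel, ]

# Compare the two versions after resetting the row names
rownames(from_adult) <- NULL
rownames(women_west) <- NULL
same_data <- isTRUE(all.equal(from_adult, women_west))
print(same_data)
```

If the comparison comes back FALSE, the two analyses are not based on the same studies, and any difference in the pooled estimates may simply reflect that.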
Michael just provided a good suggestion, using the subset argument to make sure that you are really using the same data in both analyses.

However, I would not expect the results to be exactly the same anyway. Remember that these are random/mixed-effects models you are using. So, when you use the meta-regression approach, you are estimating tau^2 differently than when using just the subset of data on European women. Essentially, the meta-regression approach assumes that the amount of heterogeneity is the same within each of the 4 possible combinations of sex and race (I am assuming that race is coded dichotomously). When you just use the subset of data on European women, you are estimating tau^2 just for that particular subgroup. Since the estimate of tau^2 is probably not the same with these two approaches, you will also get (slightly?) different results.

Here is an example from one of the datasets that come with the metafor package to illustrate this:

########################################################################

library(metafor)

### load empint data
data(dat.empint)

### calculate r-to-z transformed correlations and corresponding sampling variances
dat <- escalc(ri=ri, ni=ni, measure="ZCOR", data=dat.empint, append=TRUE)

### remove studies where struct is NA
dat <- dat[!is.na(dat$struct),]

### mixed-effects model with struct as moderator
res <- rma(yi, vi, mods=~relevel(factor(struct), ref="u"), data=dat)
res$tau2
predict(res, transf=transf.ztor, newmods=c(0,1), digits=2)

### two separate random-effects models within each level of struct
res.u <- rma(yi, vi, data=dat, subset=struct=="u")
res.s <- rma(yi, vi, data=dat, subset=struct=="s")
res.u$tau2
predict(res.u, transf=transf.ztor, digits=2)
res.s$tau2
predict(res.s, transf=transf.ztor, digits=2)

########################################################################

Note that the results are similar, but not quite the same.
This is a result of the estimate of tau^2 being different in the two subgroups from the tau^2 estimated in the meta-regression model. To illustrate this further, try this out (which simply sets the estimate of tau^2 in the two subgroup models equal to the estimate of tau^2 from the mixed-effects model):

########################################################################

res.u <- rma(yi, vi, data=dat, subset=struct=="u", tau2=res$tau2)
res.s <- rma(yi, vi, data=dat, subset=struct=="s", tau2=res$tau2)
res.u$tau2
predict(res.u, transf=transf.ztor, digits=2)
res.s$tau2
predict(res.s, transf=transf.ztor, digits=2)

########################################################################

Now you will find that the results are exactly the same with the meta-regression and the subgrouping approach.

This may lead one to think that subgrouping is the preferred approach (since it does not assume that tau^2 is the same in the subgroups). However, the precision of the estimate of tau^2 tends to be determined largely by k (the number of studies/observations). Since the overall k is often not very large to begin with (the example used above is actually a rather large meta-analysis), subgrouping makes the k within each subgroup even smaller, leading to even less precise estimates of tau^2.

Hopefully, the actual *conclusions* are not affected by the approach used.

Best,

Wolfgang

--
Wolfgang Viechtbauer, Ph.D., Statistician
Department of Psychiatry and Psychology
School for Mental Health and Neuroscience
Faculty of Health, Medicine, and Life Sciences
Maastricht University, P.O. Box 616 (VIJV1)
6200 MD Maastricht, The Netherlands
+31 (43) 388-4170 | http://www.wvbauer.com

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Jin Choi
> Sent: Saturday, May 05, 2012 19:04
> To: r-help at r-project.org
> Subject: [R] metafor
>
> Dear users of metafor,
>
> I am working on a meta-analysis using the metafor package. I have a excel
> csv database that I am working with. I am interested in pooling the effect
> measures for a particular subgroup (European women) in this csv database.
> I am conducting both sub-group and meta-regression.
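Why the pooled estimate moves with tau^2 can be seen from the random-effects weights, which are 1 / (vi + tau^2) for each study. The following is a minimal base-R sketch with made-up effect sizes and variances (not data from the thread). The helper pooled_z() is a hypothetical name introduced only for this illustration, and tanh() is used as the inverse of Fisher's r-to-z transformation.

```r
# Toy data (assumed values): Fisher-z effects and their sampling variances
yi <- c(0.10, 0.45, 0.25, 0.60, 0.05)
vi <- c(0.010, 0.050, 0.020, 0.080, 0.015)

# Random-effects pooled estimate for a given tau^2:
# each study is weighted by 1 / (vi + tau2)
pooled_z <- function(yi, vi, tau2) {
  w <- 1 / (vi + tau2)
  sum(w * yi) / sum(w)
}

# Same data, two different values of tau^2
est_a <- pooled_z(yi, vi, tau2 = 0.00)
est_b <- pooled_z(yi, vi, tau2 = 0.05)

# Back-transform to the correlation scale; tanh() inverts Fisher's z
print(round(tanh(c(tau2_0.00 = est_a, tau2_0.05 = est_b)), 3))
```

With a larger tau^2 the weights become more equal across studies, so the pooled value drifts toward the unweighted mean of the effects. The same studies can therefore yield different pooled estimates when tau^2 is estimated from a subgroup rather than from the full meta-regression model.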
I tried the subset argument as Michael suggested, which led to the same results. The results between meta-regression and subgroup analyses were only slightly different, as Wolfgang had suggested. I also believe that these minor differences must be arising from the use of random effects.

Thank you very much to both of you!

Jin

On Mon, May 7, 2012 at 5:14 AM, Viechtbauer Wolfgang (STAT) <wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:
> Michael just provided a good suggestion, using the subset argument to make sure that you are really using the same data in both analyses.
>
> However, I would not expect the results to be exactly the same anyway. Remember that these are random/mixed-effects models you are using. So, when you use the meta-regression approach, you are estimating tau^2 differently than when using just the subset of data on European women. Essentially, the meta-regression approach assumes that the amount of heterogeneity is the same within each of the 4 possible combinations of sex and race (I am assuming that race is coded dichotomously). When you just use the subset of data on European women, you are estimating tau^2 just for that particular subgroup. Since the estimate of tau^2 is probably not the same with these two approaches, you will also get (slightly?) different results.
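Wolfgang's remark that the precision of the tau^2 estimate depends largely on k can be illustrated with a small simulation. This is only a sketch under stated assumptions: the true mean, true tau^2, and study sizes are invented, and the DerSimonian-Laird estimator is used because it has a closed form, whereas rma() uses REML by default. The helpers dl_tau2() and sim_once() are hypothetical names for this illustration.

```r
# Toy simulation; all settings are assumptions chosen for illustration
set.seed(42)

# DerSimonian-Laird estimate of tau^2 from effects y and sampling variances v
dl_tau2 <- function(y, v) {
  w <- 1 / v
  ybar <- sum(w * y) / sum(w)
  Q <- sum(w * (y - ybar)^2)
  C <- sum(w) - sum(w^2) / sum(w)
  max(0, (Q - (length(y) - 1)) / C)
}

# Simulate one meta-analysis of k Fisher-z correlations and estimate tau^2
sim_once <- function(k, mu = 0.3, tau2 = 0.05) {
  n <- sample(30:200, k, replace = TRUE)  # hypothetical study sample sizes
  v <- 1 / (n - 3)                        # sampling variance of Fisher's z
  theta <- rnorm(k, mu, sqrt(tau2))       # true study-level effects
  y <- rnorm(k, theta, sqrt(v))           # observed effects
  dl_tau2(y, v)
}

# Repeat for a small and a larger number of studies
est_k5  <- replicate(2000, sim_once(5))
est_k40 <- replicate(2000, sim_once(40))

# Spread of the tau^2 estimates across simulated meta-analyses
sd_k5  <- sd(est_k5)
sd_k40 <- sd(est_k40)
print(c(k5 = sd_k5, k40 = sd_k40))
```

The estimates scatter much more widely when k is small, which is the cost of estimating tau^2 separately within each subgroup.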