thr3ads.net - R help - [R] (Meta-analysis) How to build|fake a [n]lm[e] object ? [Dec 2001]


Emmanuel Charpentier

2001-Dec-05 12:47 UTC

[R] (Meta-analysis) How to build|fake a [n]lm[e] object ?

Dear all,

I recently had to review the current literature about a medical treatment
with two possible variants (let's call them A and B). I collected all
available prospective randomized trials about this treatment : I got four
trials for the A variant and three for the B variant, all studies comparing
one variant to a "suitably chosen" placebo.

Two classes of variables are of interest here :
	a) the net effect of the treatment, which is assessed by some (set of)
	   numerical values, with distributions not too far from the normal ;
	b) the side effects of the treatment, assessed by the number of
	   occurrences of (a set of) undesirable events.

The papers report :
	a) for the numerical variables : sample size, mean and SD (or SE, which
	   allows one to recompute the SD) of each group, plus some test statistic
	   (usually Student's t) ;
	b) for events : the sample size and number of events in each group, plus
	   some test statistic (usually chi-square, sometimes incorrectly used :
	   the continuity correction is often forgotten, and Fisher's exact test
	   is almost unheard of ...).
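
As a minimal sketch of the "recompute SD from SE" step, and of re-deriving a
pooled-variance Student's t from the reported summaries as a sanity check on a
paper's test statistic. All numbers and variable names below are hypothetical,
not taken from any of the trials discussed.

```r
# Hypothetical summaries as they might be read off a paper (illustration only).
n1 <- 25; m1 <- 6.0; se1 <- 0.4    # group 1 reports the SE of the mean
n2 <- 30; m2 <- 5.1; sd2 <- 1.8    # group 2 reports the SD directly

# SD = SE * sqrt(n)
sd1 <- se1 * sqrt(n1)

# Pooled-variance two-sample t statistic from summary statistics only.
sp     <- sqrt(((n1 - 1) * sd1^2 + (n2 - 1) * sd2^2) / (n1 + n2 - 2))
t_stat <- (m1 - m2) / (sp * sqrt(1 / n1 + 1 / n2))
p_val  <- 2 * pt(-abs(t_stat), df = n1 + n2 - 2)

c(sd1 = sd1, t = t_stat, p = p_val)
```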

It made medical sense to consider the "variant" factor ancillary to the
treatment factor (that is, to *postulate* that the difference in treatment
effects between variants is much smaller than the treatment effect itself);
therefore, it is not a big problem to exclude it from the analysis. So I used
the rmeta package to assess the treatment effects. The results, as far as I
can tell, are not unreasonable.

However, I have two problems with this approach :

A) Assessing the "variant" effect : how ?
========================================
My main problem is that I can't formally assess the (quite possibly null)
effect of the "variant" factor (i.e. check, at least a posteriori, that the
"variant" effect is indeed much smaller than the treatment effect). In other
words, if I had had the trials' raw data, I would have used, for numerical
variables, something along the lines of :

meta.lme<-lme(Variable~Treatment*Variant/Trial, data=xxx, random=~1|Trial)

for a "random trial effect" (à la DerSimonian), and

meta.lm<-lm(Variable~Treatment*Variant/Trial, data=xxx)

for a "fixed trial effect" model, "treatment" and "variant" being of course
fixed effects of interest, and the Treatment*Variant interaction being the
term of interest for checking the homogeneity of the treatment effect between
variants. (In my case, the trials are somewhat heterogeneous (due to not
having the same inclusion criteria), therefore the "random effect" model
makes more sense.)

However, I do *not* have the raw data. Of course, I can trivially rebuild the
"sum-of-data" and "sum-of-squares" in each "cell" of the potential
"experimental plan". But I'm not able to analyse this. I looked in old books
(some dating back to the '50s, when computers were not readily available for
biostatistics) and saw that all algorithms used back then supposed a
*balanced* experimental plan. Some approximations were used (such as using
the harmonic mean of the sample sizes to compute the expectations of the
"between-rows", "between-columns", "between-cells" and "within-cells"
variances under the null hypothesis), but those approximations can only be
used for *mild* unbalance. In my case, this won't do : per-group sample size
varies between 10 and 244, and there is always some unbalance between
treatment groups (mainly due to stratification effects). That's *not*
"mild" ...
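
For what it is worth, the "trivial" rebuilding of the per-cell sums can be
sketched as follows. The helper name cell_sums and the small data vector are
hypothetical; the vector is only there to check the algebra against raw data.

```r
# Rebuild the sum of the data and the (uncorrected) sum of squares of one
# cell from its published n, mean and SD:
#   sum(y)   = n * mean
#   sum(y^2) = (n - 1) * SD^2 + n * mean^2
cell_sums <- function(n, m, s) {
  data.frame(sum_y = n * m, sum_y2 = (n - 1) * s^2 + n * m^2)
}

# Check on a made-up raw sample: summaries in, raw sums out.
y  <- c(4.2, 5.1, 6.3, 5.5, 4.9)
cs <- cell_sums(length(y), mean(y), sd(y))
cs
c(sum(y), sum(y^2))   # should agree with cs up to rounding
```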

I tried to follow Winer's explanation of what he calls "least-squares
estimation" (that's what all modern ANOVA software, including lm and friends,
does) to see if I could build an algorithm from this ... and got lost (I'm
pretty bad at linear algebra).

However, it appears that an lm object contains just the kind of data one can
extract from a pile of papers : one can build such an object with one line
per group of each paper, with a "residual" computed from the published SD, a
"value" computed from the published mean and a "weight" computed from the
sample size. Given that drop1, anova and related functions do not have to
re-fit the model to assess effects, one could then analyse this
artificially-reconstructed lm object.

Hence my questions :
	a) Am I totally wrong ?
	b) If not, how would you build such an object ?
	c) What cautions should be used in interpreting the results ?
	d) Would this approach work with an lme object ? With a (suitably built)
	   nlme object (in order to assess the "variant" effect on event data) ?
	e) Would such an approach allow assessing treatment effects for trials
	   with more than 2 groups (e.g. placebo vs. drug vs. surgery) ?

B) Alternatives to the odds ratio for event data ?
==================================================
The usual way to assess effects for categorical variables is to compute the
log(odds ratio) for each study and to pool them using inverse variances as
weights (that's what meta.DSL and meta.MH do, for the random and fixed effect
models respectively).
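
A hand-rolled sketch of inverse-variance pooling of log odds ratios, with the
DerSimonian-Laird moment estimate of the between-trial variance for the random
effect version. This is not the rmeta source; the counts and all variable
names are hypothetical, and no cell is zero here.

```r
# Hypothetical 2x2 counts for four trials (events / group size).
ev_t <- c(12, 8, 20, 5); n_t <- c(100, 60, 244, 40)   # treated
ev_c <- c( 6, 5, 11, 3); n_c <- c( 98, 62, 240, 42)   # control
ne_t <- n_t - ev_t                                     # non-events
ne_c <- n_c - ev_c

log_or <- log((ev_t * ne_c) / (ne_t * ev_c))
v      <- 1 / ev_t + 1 / ne_t + 1 / ev_c + 1 / ne_c    # asymptotic variance

# Fixed effect: weights are the inverse variances.
w      <- 1 / v
fixed  <- sum(w * log_or) / sum(w)
se_fix <- sqrt(1 / sum(w))

# Random effect: DerSimonian-Laird estimate of tau^2, then re-weight.
k      <- length(log_or)
Q      <- sum(w * (log_or - fixed)^2)
tau2   <- max(0, (Q - (k - 1)) / (sum(w) - sum(w^2) / sum(w)))
w_re   <- 1 / (v + tau2)
random <- sum(w_re * log_or) / sum(w_re)
se_re  <- sqrt(1 / sum(w_re))

exp(c(fixed = fixed, random = random))   # pooled odds ratios
```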

However, in some trials, some events have a frequency of zero in one or both
groups. When both groups have zero events, one can neglect the trial for the
assessment of the treatment effect, on the basis that it is not informative.
When only one group has zero events, however, the data cannot be used as they
stand (because the OR is either zero or infinite, with infinite asymptotic
variance). The treatment assessment by OR pooling dismisses these trials (see
the meta.DSL source, for example ; this is also the case in other
meta-analysis packages, such as Cochrane's RevMan).
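
To make the zero-cell problem concrete, here is a small sketch with a
hypothetical trial that has events in the treated group only. The variable
names are made up for the illustration.

```r
# Hypothetical trial: 4/50 events under treatment, 0/50 under placebo.
ev_t <- 4; ne_t <- 46   # treated: events, non-events
ev_c <- 0; ne_c <- 50   # control: events, non-events

or     <- (ev_t * ne_c) / (ne_t * ev_c)               # division by zero: Inf
log_or <- log(or)                                     # Inf
v      <- 1 / ev_t + 1 / ne_t + 1 / ev_c + 1 / ne_c   # 1/0 term: Inf
w      <- 1 / v                                       # inverse-variance weight: 0

c(or = or, log_or = log_or, variance = v, weight = w)
```

With a weight of zero the trial contributes nothing to an inverse-variance
pooled estimate, which is why such trials end up being dropped.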

But the asymmetry (some events in one group and none in the other) is indeed
information, and I do not feel at ease with discarding it. The best I can
think of is the ordinary test of independence (Fisher's test, in this case)
on a contingency table "summing" the individual trials' contingency tables.
This analysis confirms the results of the meta-analysis. But it does not
account for the trials' heterogeneity, which is a large part of the point of
a meta-analysis.
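
The "summed table" analysis described above can be sketched in base R as
follows. The three trials and their counts are hypothetical; two of them have
no event in the placebo group.

```r
# Hypothetical 2 x 2 x 3 array: Treatment x Event x Trial.
tab <- array(c(3, 0, 47, 50,    # trial T1: active 3/50,  placebo 0/50
               5, 2, 95, 98,    # trial T2: active 5/100, placebo 2/100
               2, 0, 28, 28),   # trial T3: active 2/30,  placebo 0/28
             dim = c(2, 2, 3),
             dimnames = list(Treatment = c("active", "placebo"),
                             Event     = c("yes", "no"),
                             Trial     = c("T1", "T2", "T3")))

# Collapse over trials, then apply Fisher's exact test to the single table.
# Collapsing ignores the trial structure entirely.
pooled <- apply(tab, c(1, 2), sum)
ft     <- fisher.test(pooled)

pooled
ft$p.value
```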

Someone suggested that I add a "small" quantity (say 1, or 0.5, as in the
case of Yates's correction for continuity) to the event counts in these
groups, and see whether including these studies would change the results,
but I'm "instinctively" not satisfied with this approach.

In my case, the meta-analysis exhibits an excess of some undesirable events in
one of the treatment groups, while this excess does not reach the sacrosanct
"statistical significance threshold" in any of the papers I analysed
(physicians are sometimes bloody p-value worshippers ...). Therefore, I'd like
to be damn sure to *correctly* use *all* available information.

Any suggestions or pointers to the literature ?

Sincerely yours,

						Emmanuel Charpentier

--
Emmanuel Charpentier			Tel :		+33-01 40 27 35 98
Secrétariat scientifique du CEDIT	Fax :		+33-01 40 27 55 65
Direction de la Politique Médicale // Assistance Publique - Hôpitaux de Paris
3, Avenue Victoria // F-75004 Paris /// France

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

Thomas Lumley

2001-Dec-05 17:11 UTC


[R] (Meta-analysis) How to build|fake a [n]lm[e] object ?

On Wed, 5 Dec 2001, Emmanuel Charpentier wrote:
>
> B) Alternatives to the odds ratio for event data ?
> ==================================================
>
> The usual way to assess effects for categorical variables is to compute the
> log(odds ratio) for each study and to pool them using inverse variances as
> weights (that's what meta.DSL and meta.MH do, for the random and fixed
> effect models respectively).
>
> However, in some trials, some events have a frequency of zero in one or both
> groups. When both groups have zero events, one can neglect the trial for the
> assessment of the treatment effect, on the basis that it is not informative.
> When only one group has zero events, however, the data cannot be used as
> they stand (because the OR is either zero or infinite, with infinite
> asymptotic variance). The treatment assessment by OR pooling dismisses these
> trials (see the meta.DSL source, for example ; this is also the case in
> other meta-analysis packages, such as Cochrane's RevMan).

meta.MH doesn't have this problem -- it's quite happy with zero cells.

> But the asymmetry (some events in one group and none in the other) is indeed
> information, and I do not feel at ease with discarding it. The best I can
> think of is the ordinary test of independence (Fisher's test, in this case)
> on a contingency table "summing" the individual trials' contingency tables.
> This analysis confirms the results of the meta-analysis. But it does not
> account for the trials' heterogeneity, which is a large part of the point of
> a meta-analysis.

Either meta.MH or conditional logistic regression (clogit in the
survival package) would fix this.
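
A minimal sketch of a stratified analysis that tolerates zero cells, using
base R's mantelhaen.test() rather than rmeta's meta.MH (a substitution made
for this illustration, not the code referred to above). The counts are
hypothetical.

```r
# Hypothetical 2 x 2 x 3 array: Treatment x Event x Trial; trials T1 and T3
# have no event in the placebo group.
tab <- array(c(3, 0, 47, 50,
               5, 2, 95, 98,
               2, 0, 28, 28),
             dim = c(2, 2, 3),
             dimnames = list(Treatment = c("active", "placebo"),
                             Event     = c("yes", "no"),
                             Trial     = c("T1", "T2", "T3")))

# Mantel-Haenszel test and common odds ratio, stratified by trial.
mh <- mantelhaen.test(tab)
mh$estimate   # common odds ratio, finite despite the zero cells
mh$p.value

# The same estimate by hand: sum(a*d/n) / sum(b*c/n) over trials.
n_k   <- apply(tab, 3, sum)
or_mh <- sum(tab[1, 1, ] * tab[2, 2, ] / n_k) /
         sum(tab[1, 2, ] * tab[2, 1, ] / n_k)
```

The zero-event trials still contribute to the numerator sum, so they are not
thrown away as they are under inverse-variance pooling of per-trial log odds
ratios.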

> Someone suggested that I add a "small" quantity (say 1, or 0.5, as in the
> case of Yates's correction for continuity) to the event counts in these
> groups, and see whether including these studies would change the results,
> but I'm "instinctively" not satisfied with this approach.

If you want a fixed effect of treatment there's no problem (and I
personally don't like meta-analyses where a random-effects model makes a
difference).

If you need a random effects model that doesn't object to zero cells then
lme() and variants aren't going to work, and you need a real generalized
linear mixed model with random intercept and random treatment effect.
Logistic mixed models are a hard problem.  Jim Lindsey's 'repeated'
package may handle this, though.

A little simulation would tell you what the properties of the `continuity
correction' approach are.
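
One possible shape for such a simulation, as a sketch only: the number of
trials, group size, control event rate, true odds ratio and the function name
sim_meta are all assumptions chosen for the illustration.

```r
set.seed(2001)

# Simulate one meta-analysis of k two-arm trials with rare events, add 'add'
# to all four cells of any trial containing a zero cell, and pool the log
# odds ratios with inverse-variance weights.
sim_meta <- function(k = 7, n = 50, p_ctl = 0.02, true_or = 2, add = 0.5) {
  odds_trt <- true_or * p_ctl / (1 - p_ctl)
  p_trt    <- odds_trt / (1 + odds_trt)
  ev_t <- rbinom(k, n, p_trt); ne_t <- n - ev_t
  ev_c <- rbinom(k, n, p_ctl); ne_c <- n - ev_c
  zero <- (ev_t == 0 | ne_t == 0 | ev_c == 0 | ne_c == 0)
  ev_t <- ev_t + add * zero; ne_t <- ne_t + add * zero
  ev_c <- ev_c + add * zero; ne_c <- ne_c + add * zero
  log_or <- log((ev_t * ne_c) / (ne_t * ev_c))
  w      <- 1 / (1 / ev_t + 1 / ne_t + 1 / ev_c + 1 / ne_c)
  sum(w * log_or) / sum(w)
}

pooled <- replicate(1000, sim_meta())
c(true_log_or = log(2), mean_pooled = mean(pooled), sd_pooled = sd(pooled))
```

Comparing mean_pooled with the true log odds ratio, for several values of
add and of the event rate, gives an idea of the bias the correction
introduces.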


	-thomas


friendly@hotspur.psych.yorku.ca

2001-Dec-05 17:38 UTC


[R] (Meta-analysis) How to build|fake a [n]lm[e] object ?

Emmanuel-

Perhaps I can help with one thing:

! However, I do *not* have the raw data. Of course, I can trivially rebuild
! the "sum-of-data" and "sum-of-squares" in each "cell" of the potential
! "experimental plan". But I'm not able to analyse this. I looked in old
! books (some dating back to the '50s, when computers were not readily
! available for biostatistics) and saw that all algorithms used back then
! supposed a *balanced*

There is a simple solution to the problem of going from summary statistics
to an lm() analysis which gives equivalent results, described by Larsen,
and implemented by me as a SAS macro, stat2dat.  The freq= variable
would become the weights= in lm().
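
A rough R sketch of the pseudo-observation idea, under stated assumptions.
The construction below (n - 1 values slightly above the mean and one value
well below it) is one way to reproduce each group's n, mean and SD exactly;
whether it matches the macro row for row is an assumption. Because lm()
treats weights= as precision weights rather than frequencies (so the residual
degrees of freedom would be based on the number of rows), the sketch expands
the rows instead of passing frequencies as weights. The data and the helper
name pseudo_obs are hypothetical.

```r
# Hypothetical per-group summaries (made-up numbers, not from any trial).
summ <- data.frame(
  Trial     = factor(c("T1", "T1", "T2", "T2")),
  Treatment = factor(c("placebo", "active", "placebo", "active"),
                     levels = c("placebo", "active")),
  n    = c(10, 12, 244, 230),
  mean = c(5.1, 6.0, 4.8, 5.5),
  sd   = c(1.2, 1.4, 1.1, 1.3)
)

# n - 1 values at m + s/sqrt(n) and one value at m - (n - 1) * s/sqrt(n):
# these n values have mean m and sample SD s.
pseudo_obs <- function(n, m, s) {
  a <- s / sqrt(n)
  c(rep(m + a, n - 1), m - (n - 1) * a)
}

# One row per pseudo-observation, groups kept in the order of 'summ'.
rows     <- rep(seq_len(nrow(summ)), times = summ$n)
pseudo   <- summ[rows, c("Trial", "Treatment")]
pseudo$Y <- unlist(Map(pseudo_obs, summ$n, summ$mean, summ$sd))

fit <- lm(Y ~ Trial + Treatment, data = pseudo)
anova(fit)
```

Since every group's sum and sum of squares match the originals, the ANOVA
table is the one the raw data would have given for any model whose terms are
constant within a group. Residual plots and other case-level diagnostics are
meaningless on such data.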

/*= 

 name: STAT2DAT
title: Transform a summary data set to pseudo-observations
  Doc: math.yorku.ca/SCS/sasmac/stat2dat.html
Version: 1.1
Revised: 2 Apr 1999 


=Description:

Take a dataset containing summary statistics (N, mean, std dev) for
a between groups design and produce a dataset from which PROC GLM
can be run to produce equivalent results.

=Usage:
   %stat2dat(data=inputdataset, out=outputdataset, ...,
      depvar=Y, freq=freq)

      The input dataset contains one observation for each group.
      Supply the names of variables containing the N, MEAN, and standard
      deviation (STD) for each group (see argument list below);  The
      mean square error (MSE) for a reported ANOVA can be supplied instead
      of individual STD values.  The sample size per cell can be supplied
      as a constant rather than a dataset variable if all groups are of the
      same size.  


      The output dataset can then be used with PROC GLM or PROC ANOVA
      (balanced designs).  It contains all variables from the input dataset
      plus a constructed dependent variable ('Y' by default) and
      a constructed frequency variable ('freq' by default).
      

   proc glm data=outputdataset;
      class classvars;
      freq freq;
      model Y = modelterms;
   run;

Based on:  David Larsen, Analysis of Variance With Just Summary Statistics
   as Input,  The American Statistician, May 1992, Vol. 46(2), 151-152.
   (David Larson:   dalef at uno.edu)


Michael Friendly	<friendly at yorku.ca>
Psychology Department, York University
Toronto, ONT  M3J 1P3 CANADA
=*/

