thr3ads.net - R help - [R] Summarising Data for Forrest Plots [Jul 2009]


Polwart Calum (County Durham and Darlington NHS Foundation Trust)

2009-Jul-28 18:28 UTC

[R] Summarising Data for Forrest Plots

I tried to post this a few times last week and it seems to have got stuck
somehow, so I'm trying from a different email in the hope that works. If
somehow this has appeared on the list 20 times and I never saw any of them, I
apologize ;-)

I'm basically an R-newbie. But I am VERY computer literate. But this has
me stumped...

All the examples for using the rmeta package to create a forest plot or similar
seem to use the catheter data:

         Name n.trt n.ctrl col.trt col.ctrl inf.trt inf.ctrl
1      Ciresi   124    127      15       21      13       14
2      George    44     35      10       25       1        3
3      Hannan    68     60      22       22       5        7
4       Heard   151    157      60       82       5        6
...

As I see it, that's a summary of data from several published trials.

What I want to do is a forrest (forest) plot for subgroups within my single
dataset as a test of heterogeneity. I have a dataset of patients who received
either full dose (FD) or reduced dose (RD) treatment, and a number of
characteristics about those subjects: age, sex, renal function, weight, toxicity.
And I have survival data (censored). They are in standard columnar data.

Is there an *easy* way to transform them into something like this:

        SubGroup        n.FD    n.RD    surv.FD         surv.RD
1       Age >65
2       Age <= 65
3       Male
...
9       Grade 0-2 Tox
10      Grade 3/4 Tox

Which rmeta will then let me use to create a forest plot from? This is a
reasonably standard approach in biomedical studies these days so it seems odd
that I can't find any "How-To" that tells me how to short cut it.
Otherwise I have to manually calculate each of the parameters :-( Which is a
real pain as we are awaiting more mature data which would need the same process
re-run.

Thanks in advance


Viechtbauer Wolfgang (STAT)

2009-Jul-29 07:53 UTC

[R] Summarising Data for Forrest Plots

Are n.FD and n.RD the number of people who received the full/reduced dose and
surv.FD and surv.RD the number of people that survived? And are the people who
received the full dose different from the people who received the reduced dose?
And what exactly is it that you want to plot in the forest plot? From the way
you have arranged the table, it seems as if you want some kind of effect size
measure that contrasts the survival rate of the full versus reduced dose in the
various subgroups. Is that correct? And are you just trying to figure out how to
draw the forest plot once you have the data in the table form as shown in your
post or are you also trying to figure out how to create that table to begin
with?

--
Wolfgang Viechtbauer
 Department of Methodology and Statistics
 University of Maastricht, The Netherlands
 http://www.wvbauer.com/




Polwart Calum (County Durham and Darlington NHS Foundation Trust)

2009-Jul-29 16:44 UTC

[R] Summarising Data for Forrest Plots

> Are n.FD and n.RD the number of people who received the full/reduced dose

Yes - but I don't have the data structured like that YET - that's what I want
to get to, because that's what the forest plot seems to want.

> and surv.FD and surv.RD the number of people that survived?

Mmm... I was thinking more of something like median survival? Although the brain
hasn't kicked into gear yet tonight and it might actually be meant to be a
hazard ratio?

> And are the people who received the full dose different from the people who
> received the reduced dose?

Yes

> And what exactly is it that you want to plot in the forest plot?

Subgroups - see below

> From the way you have arranged the table, it seems as if you want some kind
> of effect size measure that contrasts the survival rate of the full versus
> reduced dose in the various subgroups. Is that correct?

Yip, that sounds right

> And are you just trying to figure out how to draw the forest plot once you
> have the data in the table form as shown in your post or are you also trying to
> figure out how to create that table to begin with?

I *think* I can draw the plot once I have the data structured right. But at the
moment my data is structured like this:

PatientID  FullDose  Survival  Censored  Age  Sex  Normal Renal Func  Grade of Toxicity
001        Y         125       N         75   F    Y                  1
002        N         55        Y         55   M    N                  4
003        N         65        Y         78   F    Y                  2

I want to eventually get to a forest plot that looks a bit like this:

Age:
< 65                          |-------------#---------------|----|
>= 65       |-------------#----------------|                |
                                                            |
Sex:                                                        |
M                                                  |-----#--|---|
F      |---------------#---------------------|              |
                                                            |
Renal Func:                                                 |
Normal              |---------------#-------------|         |
Abnormal            |---------------#-------------|         |
                                                            |
Grade of Toxicity:                                          |
0-1                                                         | |-------#-------|
2                             |-----#------|                |
3-4          |----------#------------|                      |
                                                            |
Overall:                                        <>          |
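
A layout like the sketch above can be drawn with base graphics once there is one row per subgroup. The following is only a rough sketch: the numbers are invented for illustration and are not results from this thread, and `draw_forest` is a hypothetical helper, not a function from rmeta.

```r
# Illustrative numbers only; these are NOT results from the thread.
est <- data.frame(
  label = c("Age < 65", "Age >= 65", "Male", "Female", "Overall"),
  hr    = c(0.95, 0.70, 1.05, 0.65, 0.80),
  lower = c(0.60, 0.45, 0.80, 0.40, 0.65),
  upper = c(1.50, 1.09, 1.38, 1.06, 0.98)
)

# Hypothetical helper: one point and one horizontal interval per row.
draw_forest <- function(est) {
  y <- rev(seq_len(nrow(est)))                 # first row at the top
  old <- par(mar = c(4, 8, 1, 1))              # widen the left margin for labels
  on.exit(par(old))
  plot(est$hr, y, log = "x", xlim = range(est$lower, est$upper),
       pch = 15, yaxt = "n", ylab = "", xlab = "Hazard ratio (log scale)")
  segments(est$lower, y, est$upper, y)         # confidence intervals
  abline(v = 1, lty = 2)                       # line of no effect
  axis(2, at = y, labels = est$label, las = 1) # subgroup labels
  invisible(y)
}

draw_forest(est)
```

The vertical dashed line plays the role of the `|` column in the sketch; an interval that crosses it is compatible with no difference between the doses.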


Which I believe I can achieve using the metaplot or forestplot functions,
replacing the studies with the relevant subgroups. But my challenge has been
converting the patient data above into a list of subgroups, other than by
running a survival analysis on each subgroup individually, recording the
results and building up a table by hand.

Calum


Polwart Calum (County Durham and Darlington NHS Foundation Trust)

2009-Jul-29 17:06 UTC

[R] Summarising Data for Forrest Plots

>> What I want to do is a forrest (forest) plot for subgroups within my
>> single dataset as a test of heterogeneity. I have a dataset of patients who
>> received either full dose (FD) or reduced dose (RD) treatment, and a number
>> of characteristics about those subjects: age, sex, renal function, weight,
>> toxicity. And I have survival data (censored). They are in standard columnar
>> data.
>>
>> Is there an *easy* way to transform them into something like this:
>>
>>         SubGroup        n.FD    n.RD    surv.FD         surv.RD
>> 1       Age >65
>> 2       Age <= 65
>> 3       Male
>> ...
>> 9       Grade 0-2 Tox
>> 10      Grade 3/4 Tox
>>
> Hi Calum,
> Have you tried subsetting the dataset like this:
>
> meta.DSL(...,data=mydataset[mydataset$age <= 65,],...)
>
> Jim

Hi Jim,

I'm not sure that I understand! But my understanding was that meta.DSL
wants 4 bits of information: number treated (full dose in my case), number in
control (reduced dose in my case), and the number of events in the two groups...
which is what I was trying to describe above - although possibly not very well.

Then it will do the work for me.

My challenge is taking a load of data in columns and getting it summarised by
the subgroups, so that it takes Age > 65, counts how many had full dose and
how many had reduced dose, populates the fields, then does the same for
Age <= 65, etc. (I may be back with questions about the survival value - but
even knowing how to get it to summarise like I describe would be a start.)
I guess it's a bit like a pivot table in Excel?

But perhaps it's something to do with the mydataset[mydataset$age <= 65,] bit?
That seems to give me a data table with only the 65 and unders, which makes
sense. But then how do I get it to populate a table with the numbers in the two
groups?
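
The pivot-table-style count asked about above can be sketched with base R's `table()`. The data frame below is a made-up stand-in, and the column names `Dose` and `Age` are assumptions about how the real dataset is laid out.

```r
# Toy stand-in for the real data; names and values are assumptions.
mydataset <- data.frame(
  Dose = c("FD", "RD", "RD", "FD", "RD", "FD"),
  Age  = c(75, 55, 78, 60, 66, 70)
)

# Flag the subgroup, then cross-tabulate it against dose.
over65 <- mydataset$Age > 65
counts <- table(Over65 = over65, Dose = mydataset$Dose)
print(counts)

# Pull out the two cells that would fill n.FD and n.RD for the "Age > 65" row.
n.FD <- counts["TRUE", "FD"]
n.RD <- counts["TRUE", "RD"]
```

The `"FALSE"` row of the same table gives the matching counts for the complementary subgroup, so each pair of exclusive subgroups needs only one call.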


Polwart Calum (County Durham and Darlington NHS Foundation Trust)

2009-Jul-30 20:43 UTC

[R] Summarising Data for Forrest Plots

> Ah, I think I see what you want. Try this on each pair of exclusive sets:
>
> n_total <- dim(mydataset)[1]
> under65 <- mydataset$age <= 65
> n_under65 <- sum(under65)
> under65row <- c(sum(mydataset$dose[under65] == "FD"),
>  sum(mydataset$dose[under65] == "RD"),
>  sum(mydataset$vitalstatus[under65] == "dead" &
>   mydataset$dose[under65] == "FD"),
>  sum(mydataset$vitalstatus[under65] == "dead" &
>   mydataset$dose[under65] == "RD"))
> over65row <- c(sum(mydataset$dose[!under65] == "FD"),
>  sum(mydataset$dose[!under65] == "RD"),
>  sum(mydataset$vitalstatus[!under65] == "dead" &
>   mydataset$dose[!under65] == "FD"),
>  sum(mydataset$vitalstatus[!under65] == "dead" &
>   mydataset$dose[!under65] == "RD"))
>
> Then under65row and over65row should be the first two rows of your result.
> Can't test this at the moment, but I don't think it's too far wrong.

Thanks Jim.

Yes, it looks like that code should do the job. I was really hoping for a
function like "SummariseForSubsetAnalysis(mydataset, by=mydataset$dose,
subsets=c(age, renal, sex, toxicity), event=survival )" which would
magically do it for me ;-)

I guess if this is something I start having to do lots I might have to write
one.

Surprised one doesn't seem to exist - perhaps the number of variations in
what people want would be too complex.
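
A helper along the lines wished for above might look roughly like this. It is only a sketch: `summarise_subgroups` is a hypothetical name, not a function from rmeta or any other package, the toy data and column names are assumptions, and it counts raw events without accounting for censoring times.

```r
# Hypothetical helper; name and arguments are illustrative only.
# `subgroups` is a named list of logical vectors, one per subgroup.
summarise_subgroups <- function(data, dose, event, subgroups) {
  rows <- lapply(names(subgroups), function(label) {
    in_group <- subgroups[[label]]
    fd <- in_group & data[[dose]] == "FD"
    rd <- in_group & data[[dose]] == "RD"
    data.frame(
      SubGroup  = label,
      n.FD      = sum(fd),
      n.RD      = sum(rd),
      events.FD = sum(data[[event]][fd]),   # event column must be logical
      events.RD = sum(data[[event]][rd])
    )
  })
  do.call(rbind, rows)
}

# Toy data; values are made up.
mydataset <- data.frame(
  Dose = c("FD", "RD", "RD", "FD", "RD", "FD"),
  Dead = c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE),
  Age  = c(75, 55, 78, 60, 66, 70),
  Sex  = c("F", "M", "F", "M", "M", "F")
)

result <- summarise_subgroups(
  mydataset, dose = "Dose", event = "Dead",
  subgroups = list(
    "Age > 65"  = mydataset$Age > 65,
    "Age <= 65" = mydataset$Age <= 65,
    "Male"      = mydataset$Sex == "M",
    "Female"    = mydataset$Sex == "F"
  )
)
print(result)
```

Passing the subgroups as logical vectors keeps the helper indifferent to how each subgroup is defined, which sidesteps some of the variation in what people want.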

Calum



Polwart Calum (County Durham and Darlington NHS Foundation Trust)

2009-Jul-30 22:35 UTC

[R] Summarising Data for Forrest Plots

> Ah, I think I see what you want. Try this on each pair of exclusive sets:
> <snip>
> Then under65row and over65row should be the first two rows of your result.
> Can't test this at the moment, but I don't think it's too far wrong.

I knew this shouldn't need so much work ;-)

Not cracked it yet - because as I see it I need a 2 x 4 table and at the moment
I only cracked a 2 x 2 table.  ( Or really I need something like a 10 x 4 - but
the 4 is the bit that I haven't cracked)

First option is something like this:
with(mydataset,  table(Sex, Dose))

I can get:
   Dose
Sex FD RD
  F  6 15
  M 16 23

For non-categorical data it's slightly trickier... but quite achievable in two
lines (for the 2 x 2 table):

AgeBands <- cut(mydataset$Age, breaks = c(0, 65, 100))
table(AgeBands, mydataset$Dose)

Which gives:

AgeBands   FD RD
  (0,65]   15  6
  (65,100] 13 26

Although I'm not yet sure whether I can actually call that data back by column
name, i.e.

x <- table (AgeBands, mydataset$Dose)
x$FD

produces an error. :-(

But getting there.
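
On the `x$FD` error above: a `table` is an array rather than a list, so `$` does not apply to it, but matrix-style indexing by name does. A small sketch with made-up data (the column names are assumptions):

```r
# Toy stand-in data; values are made up.
mydataset <- data.frame(
  Age  = c(75, 55, 78, 60, 66, 70),
  Dose = c("FD", "RD", "RD", "FD", "RD", "FD")
)
AgeBands <- cut(mydataset$Age, breaks = c(0, 65, 100))
x <- table(AgeBands, Dose = mydataset$Dose)

# Index the table like a matrix, by row and column name.
fd_counts <- x[, "FD"]             # named vector: one count per age band
one_cell  <- x["(65,100]", "RD"]   # a single cell

# Or convert to a data frame with one column per dose, where `$` works.
wide <- as.data.frame.matrix(x)
wide$FD
```

The data-frame form is probably the more convenient one for building up the rows of a summary table.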
