For any given pre-specified gene or short list of genes, yes, the Cox model works fine. Two important caveats:

1. Remember the rule of thumb for a Cox model of 20 events per variable (not n=20). Many microarray studies will have very marginal sample size.

2. If you are looking at many genes then a completely different strategy is required. There is a large and growing literature; I like Newton et al, Annals of Applied Statistics, 2007, 85-106 as an intro; but expect to read much more.

Terry Therneau

-------- begin included message ---------

I want to test the expression of a subset of genes for correlation with patient survival. I found out that the coxph function is appropriate for doing this since it works with continuous variables such as Affy mRNA expression values.

I applied the following code:

cp <- coxph(Surv(t.rfs, !e.rfs) ~ ex, pData(eset.n0))
# t.rfs: time to relapse, status (0=alive,1=dead), ex: expression value (continuous)

The results I get look sensible but I would appreciate any advice on the correctness and also any suggestions for any (better) alternative methods.

Best wishes
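The first caveat above (about 20 events per variable) can be checked with a few lines of base R before fitting anything. This is only a rough sketch on made-up data: the data frame `dat`, its columns, and the helper `max_covariates_for()` are hypothetical names, not part of the original post, and the threshold of 20 is the rule of thumb quoted above rather than a hard limit.

```r
# Hypothetical follow-up data: 120 patients, 45 of whom had the event.
# 'event' is coded 1 = event observed, 0 = censored (an assumption here).
dat <- data.frame(
  time  = seq(1, 120),                      # follow-up time, arbitrary units
  event = rep(c(1, 0), times = c(45, 75)),  # 45 events, 75 censored
  ex    = seq(6, 10, length.out = 120)      # continuous expression value
)

# Hypothetical helper: how many covariates the rule of thumb would allow.
max_covariates_for <- function(event, events_per_variable = 20) {
  n_events <- sum(event == 1, na.rm = TRUE)
  floor(n_events / events_per_variable)
}

n_events       <- sum(dat$event == 1)
max_covariates <- max_covariates_for(dat$event)

cat("patients:", nrow(dat), "\n")
cat("events:", n_events, "\n")
cat("covariates supported at 20 events each:", max_covariates, "\n")
```

The point of the sketch is that the number of events, not the number of patients, is what limits the model: here 120 patients but only 45 events would support about two covariates under this rule.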
Dear Terry,

thank you very much for this. The number of events in my data is well above the suggested size. What concerns me is the fact that a very high proportion of probes comes up as significant when I just randomly select them. This just does not seem to be biologically meaningful.

Do you know of a method that allows you to split your samples into, say, low-expressors and high-expressors? Something like regarding samples with an expression of the probe lower than the median expression minus a constant as low-expressors, and those with expression higher than the median expression plus the constant as high-expressors. Do you have any ideas on this?

Best wishes

Kristian
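The split described above (median minus a constant, median plus a constant) could be sketched in base R roughly as follows. The function name `split_expressors()`, the argument `delta`, and the example values are all hypothetical; this only illustrates the grouping rule as stated in the question and is not a recommendation of any particular cutoff.

```r
# Hypothetical helper: label samples as "low" or "high" expressors.
# Samples within 'delta' of the median are left as NA (unclassified).
split_expressors <- function(ex, delta) {
  med <- median(ex, na.rm = TRUE)
  grp <- rep(NA_character_, length(ex))
  grp[which(ex < med - delta)] <- "low"   # below median - delta
  grp[which(ex > med + delta)] <- "high"  # above median + delta
  factor(grp, levels = c("low", "high"))
}

# Made-up expression values for nine samples (median is 8.0).
ex  <- c(5.1, 6.0, 6.9, 7.4, 8.0, 8.6, 9.2, 10.3, 11.0)
grp <- split_expressors(ex, delta = 1)

print(table(grp, useNA = "ifany"))
```

The resulting factor could then stand in for `ex` on the right-hand side of the model formula. Note that samples falling between the two cutoffs are NA and would be dropped from such a fit, so the number of events available to the model shrinks as `delta` grows.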
On Jan 7, 2011, at 3:33 PM, Terry Therneau wrote:

> For any given pre-specified gene or short list of genes, yes, the Cox
> model works fine. Two important caveats:
>
> 1. Remember the rule of thumb for a Cox model of 20 events per variable
> (not n=20). Many microarray studies will have very marginal sample
> size.
>
> 2. If you are looking at many genes then a completely different
> strategy is required. There is a large and growing literature; I like
> Newton et al, Annals of Applied Statistics, 2007, 85-106 as an intro;
> but expect to read much more.

Trying my university library first without success, a Google search then returned this:

http://projecteuclid.org/DPubS?service=UI&version=1.0&verb=Display&handle=euclid.aoas/1183143730

Open Access article and an associated R package, allez. Life is good!

-- 
David.

David Winsemius, MD
West Hartford, CT