Dear R-Helpers,

I am a novice in survival analysis. I have the following code:

    for (i in 3:12) print(coxph(Surv(time, status)~a[,i], data=a))

I used it to fit the Cox Proportional Hazard models separately for every available parameter (columns 3:12) in my data set - with intention to compare the Hazard Ratios.

However, some of my variables are in range 0.1 to 1.6, others in range 5000 to 9000. How do I compare HRs between such variables?

I have rescaled all the variables to be in 0 to 1 range - is this the proper way to go? Is there a way to somehow calculate the same HRs (as for rescaled parameters) from the HRs for original parameters?

Many thanks in advance.

--
Michal J. Figurski, PhD
HUP, Pathology & Laboratory Medicine
Biomarker Research Laboratory
3400 Spruce St. 7 Maloney
Philadelphia, PA 19104
tel. (215) 662-3413
There are a lot of issues related to this that will require a good bit of study, both in survival analysis and in regression. I would start with bootstrapping the ranks of the likelihood ratio chi-square statistics of the competing biomarkers.

Frank

--
Frank E Harrell Jr
Professor and Chairman, School of Medicine
Department of Biostatistics, Vanderbilt University
On Mar 30, 2010, at 3:45 PM, Michal Figurski wrote:

> I am a novice in survival analysis. I have the following code:
> for (i in 3:12) print(coxph(Surv(time, status)~a[,i], data=a))
>
> I used it to fit the Cox Proportional Hazard models separately for
> every available parameter (columns 3:12) in my data set - with
> intention to compare the Hazard Ratios.

Of dubious statistical validity, at least for modest sample sizes. You should try that method with randomly generated data and see what you get.

> However, some of my variables are in range 0.1 to 1.6, others in
> range 5000 to 9000. How do I compare HRs between such variables?
>
> I have rescaled all the variables to be in 0 to 1 range - is this
> the proper way to go?

Seems doubtful. Scaling by the range will let the outliers dominate the scaling.

> Is there a way to somehow calculate the same HRs (as for rescaled
> parameters) from the HRs for original parameters?

You could do a lot better by following Frank Harrell's example and using the difference between the 25th and 75th percentiles as a common scaling strategy. His summary function (summary.rms in the rms package) reports effects over this interquartile range by default. You are then comparing cases at the 25th percentile (the top of the lowest quartile) with cases at the 75th percentile (the bottom of the highest quartile). No assumptions of normality need be made, and you are much less subject to the erratic sampling properties of the zeroth and 100th percentiles.

--
David Winsemius, MD
West Hartford, CT
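The interquartile-range scaling described above, and the question of recovering a rescaled HR from the original one, can be sketched in a few lines. This is only an illustration under stated assumptions: it uses the `lung` data shipped with the survival package in place of the poster's data frame `a`, and the three chosen variables are arbitrary stand-ins. It relies on the fact that, for a linear term with coefficient beta, the hazard ratio for a change of c units is exp(beta * c), so no refitting is needed.

```r
library(survival)

# Illustrative data only (assumption): `lung` stands in for the poster's `a`.
d <- na.omit(lung[, c("time", "status", "age", "meal.cal", "wt.loss")])
vars <- c("age", "meal.cal", "wt.loss")

iqr_hr <- sapply(vars, function(v) {
  # One single-predictor Cox model per variable, as in the original loop
  fit <- coxph(as.formula(paste("Surv(time, status) ~", v)), data = d)
  beta <- unname(coef(fit))
  iqr_width <- unname(diff(quantile(d[[v]], c(0.25, 0.75))))
  range_width <- diff(range(d[[v]]))
  c(hr_per_unit  = exp(beta),                # HR per one original unit
    iqr          = iqr_width,                # 75th minus 25th percentile
    hr_per_iqr   = exp(beta * iqr_width),    # HR for 75th vs 25th percentile
    hr_per_range = exp(beta * range_width))  # HR after rescaling to 0..1
})

round(t(iqr_hr), 3)
```

The `hr_per_range` row equals the HR that a refit on the 0-to-1 rescaled variable would give, because subtracting the minimum does not change a Cox coefficient and dividing by the range multiplies it by the range. Equivalently, `hr_per_iqr` is `hr_per_unit^iqr`.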
Thank you, gentlemen. I greatly appreciate your help. -- Michal J. Figurski, PhD HUP, Pathology & Laboratory Medicine Biomarker Research Laboratory 3400 Spruce St. 7 Maloney Philadelphia, PA 19104 tel. (215) 662-3413
Frank,

Is there an article that discusses this idea of bootstrapping the ranks of the likelihood ratio chi-square statistics to assess relative importance of predictors in time-to-event data (specifically Cox PH model)?

Thanks,
Ravi.
Ravi Varadhan wrote:

> Is there an article that discusses this idea of bootstrapping the ranks of
> the likelihood ratio chi-square statistics to assess relative importance
> of predictors in time-to-event data (specifically Cox PH model)?

Do require(rms); ?anova.rms and see related articles:

@Article{hal09usi,
  author  = {Hall, Peter and Miller, Hugh},
  title   = {Using the bootstrap to quantify the authority of an empirical ranking},
  journal = {Annals of Statistics},
  year    = 2009,
  volume  = 37,
  number  = {6B},
  pages   = {3929-3959},
  annote  = {confidence interval for ranks;genomics;high dimension;independent component bootstrap;$m$-out-of-$n$ bootstrap;ordering;overlap interval;prediction interval;synchronous bootstrap;ordinary bootstrap may not provide accurate confidence intervals for ranks;may need a different bootstrap if the number of parameters being ranked increases with $n$ or is large;estimating $m$ is difficult;in their first example, where $m=0.355n$, the ordinary bootstrap provided a lower bound to the lengths of more accurate confidence intervals of ranks}
}

@Article{xie09con,
  author  = {Xie, Minge and Singh, Kesar and Zhang, {Cun-Hui}},
  title   = {Confidence intervals for population ranks in the presence of ties and near ties},
  journal = {Journal of the American Statistical Association},
  year    = 2009,
  volume  = 104,
  number  = 486,
  pages   = {775-787},
  annote  = {bootstrap ranks;ranking;nonstandard bootstrap inference;rank inference;slow convergence rate;smooth ranks in the presence of near ties;rank inference for fixed effects risk adjustment models}
}

--
Frank E Harrell Jr
Professor and Chairman, School of Medicine
Department of Biostatistics, Vanderbilt University
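One possible reading of the suggestion to bootstrap the ranks of the likelihood ratio chi-square statistics is sketched below. It is not taken from the rms documentation or from the cited articles. The assumptions are the `lung` data from the survival package as a stand-in for the poster's data, four arbitrary single-degree-of-freedom predictors, 200 ordinary bootstrap resamples, and simple percentile intervals. As the annotation on Hall and Miller notes, the ordinary bootstrap may give rank intervals that are too narrow.

```r
library(survival)

# Illustrative data only (assumption): `lung` stands in for the poster's data.
vars <- c("age", "meal.cal", "wt.loss", "ph.karno")
d <- na.omit(lung[, c("time", "status", vars)])

# Likelihood ratio chi-square of each single-predictor Cox model.
# fit$loglik holds the log partial likelihood at beta = 0 and at the estimate.
lr_chisq <- function(dat) {
  sapply(vars, function(v) {
    fit <- coxph(as.formula(paste("Surv(time, status) ~", v)), data = dat)
    2 * diff(fit$loglik)
  })
}

set.seed(1)
B <- 200  # number of bootstrap resamples (arbitrary choice)
boot_ranks <- t(replicate(B, {
  idx <- sample(nrow(d), replace = TRUE)  # resample subjects with replacement
  rank(-lr_chisq(d[idx, ]))               # rank 1 = largest chi-square
}))

# Observed ranks alongside 0.95 percentile intervals of the bootstrap ranks
cbind(observed = rank(-lr_chisq(d)),
      t(apply(boot_ranks, 2, quantile, probs = c(0.025, 0.975))))
```

Wide or overlapping rank intervals indicate that the data cannot reliably order the predictors. With predictors that use different numbers of degrees of freedom, the raw chi-squares would not be directly comparable and some adjustment for degrees of freedom would be needed.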