thr3ads.net - R help - [R] Slow survfit -- is there a faster alternative? [Dec 2009]

gregory.bronner at barclayscapital.com

2009-Dec-22 01:20 UTC

[R] Slow survfit -- is there a faster alternative?

Using R 2.10 on Windows:

I have a filtered database of 650k event observations in a data frame
with 20+ variables.

I'd like to be able to quickly estimate and plot survival curves.
However, the survfit() and cph() functions are extremely slow.


As an example: I tried 

results.cox<-coxph(Surv(duration, success) ~ start_time + factor1+
factor2+ variable3, data=filteredData) #(took a few seconds)

plot(results.cox)
#(never finished in an hour)

I also tried the cph() function, with similar results.


Is there some easier quick-and-dirty way of producing and plotting
survival curves for large data sets? I've seen some references on this
list that suggest that the underlying algorithm is O(numObs *
numSuccesses) and could be sped up. Has this been done?

Thanks,

David Winsemius

2009-Dec-22 01:59 UTC


On Dec 21, 2009, at 8:20 PM, <gregory.bronner at barclayscapital.com>  
wrote:
> Using R 2.10 on Windows:
>
> I have a filtered database of 650k event observations in a data frame
> with 20+ variables.
>
> I'd like to be able to quickly generate estimate and plot survival
> curves. However the survfit and cph() functions are extremely slow.
>
>
> As an example: I tried
>
> results.cox<-coxph(Surv(duration, success) ~ start_time + factor1+
> factor2+ variable3, data=filteredData) #(took a few seconds)
>
> plot(results.cox)
> #(never finished in an hour)
Something is wrong here. I use cph (from the Design package) on
datasets numbering in the millions with crossed spline terms and the
plots are virtually immediate.
>
> I also tried the cph() function, with similar results.
The plot.Design function needs more than just the fit as an argument,  
so you are not providing enough information for good advice. When I  
try to plot with an object produced by coxph with no further  
arguments, I get an error.

 > plot(survHb)
Error in xy.coords(x, y, xlabel, ylabel, log) :
   'x' and 'y' lengths differ

What happens when you use predict or survfit to process the fit objects?

?survfit.coxph

E.g.:
 > survHb <- coxph(Surv(surv.yr, death) ~ age+nsmkr +sexMF + HbPr2 +  
GGT, data=hisub)
 > plot(Hb)
Error: object 'Hb' not found
Error in plot(Hb) :
   error in evaluating the argument 'x' in selecting a method for  
function 'plot'

#this however, succeeds --->

 > sfit<-survfit(survHb)
 > plot(sfit)

So you need to supply a processed form of the fit (for example, the
object returned by survfit) rather than the raw coxph object.
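
For anyone who wants to try the fit-then-survfit-then-plot sequence without
the data used above, here is a minimal sketch. It assumes the survival
package and its bundled lung data set; the covariates and the newdata
values (age 60, sex 1) are illustrative choices, not anything from the
original poster's data.

```r
library(survival)  # provides Surv(), coxph(), survfit() and the lung data

# Fit a Cox model. In lung, status is coded 1 = censored, 2 = dead.
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# Convert the fit into a survival curve for one hypothetical subject.
# The newdata values are assumptions chosen only for illustration.
sfit <- survfit(fit, newdata = data.frame(age = 60, sex = 1))

# Plot the survfit object, not the coxph object.
plot(sfit, xlab = "Days", ylab = "Estimated survival")
```

Passing newdata explicitly keeps the result to a single curve, so the plot
stays cheap to draw even when the model was fitted on many rows.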

>
>
> Is there some easier quick-and-dirty way of producing and plotting
> survival curves for large data sets? I've seen some references on this
> list that suggest that the underlying algorithm is O(numObs *
> numSuccesses) and could be sped up. Has this been done?
>
> Thanks,


David Winsemius, MD
Heritage Laboratories
West Hartford, CT
