Dear all,
I'm trying to fit a curve to some 1 year failure-time data, so that I can
extrapolate and predict failure rates up to 3 years. The data is in the
general form:
Treatment Time Status
Treatment A 28 0
Treatment B 28 0
Treatment B 28 0
Treatment A 28 1
Treatment B 56 0
Treatment A 56 0
Treatment B 84 0
Treatment A 84 0
Treatment A 84 0
etc etc
Where time is in days and goes up to 360, 0 represents a withdrawal and 1
represents a failure. The code I've tried is as follows:
survdata<-read.csv("Data.csv")
WeiModel<-survreg(Surv(survdata$Time,survdata$Status)~survdata$Treatment)
summary(WeiModel)
n<-seq(0, 1080, 30 )
##Compute Weibull CDF
A<-pweibull(n, scale=exp(WeiModel$coef[1]), shape=1/WeiModel$scale)
B<-pweibull(n,
scale=exp(WeiModel$coef[1]+WeiModel$coef[2]),shape=1/WeiModel$scale)
##Transform into survivor function
ACurv<-1-A
BCurv<-1-B
The problem is, that the Weibull curve is predicting a survival probability
for treatment B of around 11% at the end of year 3 (day 1080), and past
analyses have shown that this value should be somewhere in the region of
~40%. I'm finding a similar problem for treatment A also. I've also
tried
fitting an exponential distribution but encountered the same problems. My
questions are:
1) Have I computed the survivor function correctly, in terms of moving
from the survreg output to the pweibull parameters?
2) If the above seems correct, then is there an obvious methodological
reason why I'm getting survival probabilities far less than expected? (for
example, is the general method I'm using not the best approach? If so can
you suggest a better one?)
3) Is it worth trying to do the same thing with the other distributions
that R uses for survival analysis (lognormal, gaussian etc). If so, how do
you extrapolate the model out to 3 years? Are there functions that are
equivalent to the 'pweibull' function for all these distributions?
Any help you could give on this would be most appreciated.
Kind regards,
Tim.
--
View this message in context:
http://r.789695.n4.nabble.com/Survival-analysis-extrapolation-tp2231735p2231735.html
Sent from the R help mailing list archive at Nabble.com.
A few suggestions
1. Use the data argument
wfit <- survreg(Surv(Time, Status) ~ Treatment, data=survdata)
summary(wfit)
2. Use the predict function to look at your survival curve. This
avoids any parameter change mistakes. It also works for any
distribution.
pp <- 0:60/100
wsurv <- predict(wfit, type='quantile', p=pp,
newdata=survdata[1:2,])
matplot(t(wsurv), 1-pp, xlab="time", ylab="survival",
type='l')
Don't bother to ask for the 100th percentile -- it is infinity. I chose
0-60 based on your questions.
(Why newdata=survdata[1:2,]? The default for predict is to give a curve
for every observation in the original data. You only need 2, one for a
subject on A and one for a subject on B, and your test data showed lines
1&2 of the data had one of each).
3. Overlay a Kaplan-Meier, to partially answer your third question.
kmfit <- survfit(Surv(Time, Status) ~ Treatment, data=survdata)
lines(kmfit, mark.time=F, col=1:2)
Terry Therneau
--- begin included message --
I'm trying to fit a curve to some 1 year failure-time data, so that I
can
extrapolate and predict failure rates up to 3 years. The data is in the
general form:
Treatment Time Status
Treatment A 28 0
Treatment B 28 0
Treatment B 28 0
Treatment A 28 1
Treatment B 56 0
Treatment A 56 0
Treatment B 84 0
Treatment A 84 0
Treatment A 84 0
etc etc
Where time is in days and goes up to 360, 0 represents a withdrawal and
1
represents a failure. The code I've tried is as follows:
survdata<-read.csv("Data.csv")
WeiModel<-survreg(Surv(survdata$Time,survdata$Status)~survdata
$Treatment)
summary(WeiModel)
n<-seq(0, 1080, 30 )
##Compute Weibull CDF
A<-pweibull(n, scale=exp(WeiModel$coef[1]), shape=1/WeiModel$scale)
B<-pweibull(n,
scale=exp(WeiModel$coef[1]+WeiModel$coef[2]),shape=1/WeiModel$scale)
##Transform into survivor function
ACurv<-1-A
BCurv<-1-B
The problem is, that the Weibull curve is predicting a survival
probability
for treatment B of around 11% at the end of year 3 (day 1080), and past
analyses have shown that this value should be somewhere in the region of
~40%. I'm finding a similar problem for treatment A also. I've also
tried
fitting an exponential distribution but encountered the same problems.
My
questions are:
1) Have I computed the survivor function correctly, in terms of
moving
from the survreg output to the pweibull parameters?
2) If the above seems correct, then is there an obvious
methodological
reason why I'm getting survival probabilities far less than expected?
(for
example, is the general method I'm using not the best approach? If so
can
you suggest a better one?)
3) Is it worth trying to do the same thing with the other
distributions
that R uses for survival analysis (lognormal, gaussian etc). If so, how
do
you extrapolate the model out to 3 years? Are there functions that are
equivalent to the 'pweibull' function for all these distributions?
Any help you could give on this would be most appreciated.
Kind regards,
Tim.
Dear Terry, Thanks so much for your help; I'm a bit of an R novice at the moment (as you can probably tell from my failure to use the data argument!) so any help is most welcome. I'm hoping to use this model to generate transition probabilities for a Markov model and, as such, I was wondering if there is an easy way of returning the probability of failure at discrete time intervals, (i.e. 360, 390, 420, 450...etc)? Kind regards, Tim. -- View this message in context: http://r.789695.n4.nabble.com/Survival-analysis-extrapolation-tp2231735p2234691.html Sent from the R help mailing list archive at Nabble.com.