thr3ads.net - R help - [R] Survival Analysis and Predict time-to-death [Aug 2015]

survivalUser

2015-Aug-17 19:10 UTC

[R] Survival Analysis and Predict time-to-death

Dear All,

I would like to build a model, based on survival analysis on some data, that
is able to predict the /*expected time until death*/ for a new data
instance.

Data
For each individual in the population I have, for each unit of time, the
status information and several continuous covariates for that particular
time. The data is right-censored, since at the end of the time interval
analyzed, instances could still be alive and die later.

Model
I created the model using R and the survreg function:

lfit <- survreg(Surv(time, status) ~ X) 

where:
- time is the time vector
- status is the status vector (0 alive, 1 death)
- X is a bind of multiple vectors of covariates

Predict time to death
Given a new individual with some covariate values, I would like to predict
the estimated time to death, in other words the number of time units for
which the individual will remain alive.

I think I can use this:

ptime <- predict(lfit, newdata=data.frame(X=NEWDATA),
type='response')

Is that correct? Am I going to get the expected-time-to-death that I would
like to have?

In theory, I could also provide the time information (the time at which the
individual has those covariate values). Should I simply add that to
newdata:

ptime <- predict(lfit, newdata=data.frame(time=TIME, X=NEWDATA),
type='response')

Is that correct? Is this going to improve the prediction? (for my data, the
time already passed should be an important variable).

Any other suggestions or comments?

Thank you!



--
View this message in context:
http://r.789695.n4.nabble.com/Survival-Analysis-and-Predict-time-to-death-tp4711198.html
Sent from the R help mailing list archive at Nabble.com.

David Winsemius

2015-Aug-17 20:51 UTC

On Aug 17, 2015, at 12:10 PM, survivalUser wrote:
> Dear All,
> 
> I would like to build a model, based on survival analysis on some data, that
> is able to predict the /*expected time until death*/ for a new data
> instance.
Are you sure you want to use life expectancy as the outcome? In order to
establish a mathematical expectation you need to know the risk at all times
in the future, which, as pointed out in the print.survfit help page, is undefined
unless the last observation is a death. Very few datasets support such an
estimate. If, on the other hand, you have sufficient events in the future, then
you may be able to more readily justify an estimate of median survival.

The print.survfit function does give choices of a "restricted mean
survival" or time-to-median-survival as estimate options. See that
function's help page.
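
A minimal sketch of the two summaries mentioned above, assuming the `lung` data set shipped with the survival package as a stand-in for the poster's data. The 365-day horizon is an arbitrary choice for illustration, not something taken from the thread:

```r
library(survival)

# Kaplan-Meier fit on the lung data shipped with the survival package
# (an assumed stand-in; the poster's data are not shown).
km <- survfit(Surv(time, status) ~ 1, data = lung)

# print.survfit reports the median; rmean = 365 adds the restricted
# mean survival time up to an illustrative horizon of 365 days.
print(km, rmean = 365)

# The same quantities extracted programmatically. The restricted-mean
# entry is labelled "rmean" or "*rmean" depending on the package version.
tab <- summary(km, rmean = 365)$table
rmean_365 <- unname(tab[names(tab) %in% c("rmean", "*rmean")])
median_km <- unname(tab["median"])
```

The restricted mean is the area under the survival curve up to the chosen horizon, so it stays well defined even when the last observation is censored.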
> Data
> For each individual in the population I have, for each unit of time, the
> status information and several continuous covariates for that particular
> time. The data is right-censored, since at the end of the time interval
> analyzed, instances could still be alive and die later.
> 
> Model
> I created the model using R and the survreg function:
> 
> lfit <- survreg(Surv(time, status) ~ X) 
> 
> where:
> - time is the time vector
> - status is the status vector (0 alive, 1 death)
> - X is a bind of multiple vectors of covariates
> 
> Predict time to death
> Given a new individual with some covariate values, I would like to predict
> the estimated time to death, in other words the number of time units for
> which the individual will remain alive.
> 
> I think I can use this:
> 
> ptime <- predict(lfit, newdata=data.frame(X=NEWDATA),
> type='response')
I don't see type="response" as a documented option in the
`?predict.survreg` help page. Were you suggesting that code on the basis of some
tutorial?
> Is that correct? Am I going to get the expected-time-to-death that I would
> like to have?
Most people would be using `survfit` to construct survival estimates.
> 
> In theory, I could provide also the time information (the time when the
> individual has those covariates values), should I simply add that in the
> newdata:
> 
> ptime <- predict(lfit, newdata=data.frame(time=TIME, X=NEWDATA),
> type='response')
> 
> Is that correct?
This sounds like you are considering time-varying predictors. Adding them as a
'newdata' argument is most definitely not the correct method. As such I
would ask if you really wanted to use a parametric survival model in the first
place? The coxph function has facilities for time-varying covariates.
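
As a hedged illustration of what those facilities look like: the usual approach is the counting-process form, in which each subject contributes one row per interval (start, stop] over which the covariates are constant. The sketch below uses the `heart` (Stanford heart transplant) data shipped with the survival package as an assumed stand-in, since the poster's data are not shown:

```r
library(survival)

# heart holds one row per (start, stop] interval; transplant switches
# from 0 to 1 within a subject once a transplant has happened.
head(heart)

# Surv(start, stop, event) tells coxph that each row covers an interval,
# which is how time-varying covariates enter the model.
cfit <- coxph(Surv(start, stop, event) ~ age + transplant, data = heart)
summary(cfit)
```

This only shows how such a model is fitted. Turning it into a predicted survival curve for a new individual requires specifying that individual's whole future covariate path, which is part of why prediction becomes harder.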

> Is this going to improve the prediction?
It would most likely severely complicate prediction. Survival estimates may be
more problematic in that case on theoretical grounds.
> (for my data, the
> time already passed should be an important variable).
> 
> Any other suggestions or comments?
> 
> Thank you!
> 
R-help at r-project.org

The real Rhelp mailing list  ....   not the impostor Rhelp at Nabble

-- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

-- 

David Winsemius
Alameda, CA, USA

survivalUser

2015-Aug-17 21:18 UTC

Thank you David for your answer.

Some follow-up questions:

- So, do you think that trying to estimate the life expectancy would be risky
and probably not justifiable? Is there some sort of 'confidence' that the
model could give me for a prediction?

- type=response - I found it here: 
https://stat.ethz.ch/R-manual/R-devel/library/survival/html/predict.survreg.html

I have not tried it yet, but I was planning to use it because the help page
says it predicts on the "original scale of the data".

- Yes, I think they are time-varying predictors. Would you suggest other
models? (coxph?)

Overall, do you think this analysis is feasible/correct? Is predicting how
long a new individual (with those covariates) will stay alive a reasonable
thing to do with a survival model?

Thank you again!





Bert Gunter

2015-Aug-17 22:39 UTC

David:

I may have misunderstood you here, specifically:

"As such I would ask if you really wanted to use a parametric survival
model in the first place? "

The K-M curve is, of course, a **non-parametric** fit, and that is
why there can be no mean survival time unless the last point is a
death.

If you use the sample data to estimate a **parametric** model, then,
of course, you can estimate mean survival time (at any covariate
value) as the mean of the fitted distribution at the predicted
parameter estimates (e.g. through a link function).
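
For the Weibull case this can be made concrete. Assuming survreg's location-scale parameterization on log time, with linear predictor $x^\top\beta$ and the reported `scale` written as $\sigma$ (these symbols are introduced here for illustration and do not appear in the thread), the model and the resulting mean and median are:

```latex
\begin{align}
\log T &= x^\top\beta + \sigma W, \qquad W \sim \text{standard minimum extreme value} \\
T \mid x &\sim \text{Weibull}\left(\text{shape} = 1/\sigma,\ \text{scale} = e^{x^\top\beta}\right) \\
E[T \mid x] &= e^{x^\top\beta}\,\Gamma(1 + \sigma) \\
\operatorname{median}(T \mid x) &= e^{x^\top\beta}\,(\log 2)^{\sigma}
\end{align}
```

The mean depends on the whole right tail of the fitted distribution, so it is an extrapolation beyond the follow-up period whenever the data are heavily censored.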

I would certainly agree that the OP seems pretty confused about all
this. And apologies if I have misunderstood.

Cheers,
Bert


Bert Gunter

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
   -- Clifford Stoll



David Winsemius

2015-Aug-17 23:44 UTC

On Aug 17, 2015, at 1:51 PM, David Winsemius wrote:
> 
> On Aug 17, 2015, at 12:10 PM, survivalUser wrote:
> 
>> Dear All,
>> 
>> I would like to build a model, based on survival analysis on some data, that
>> is able to predict the /*expected time until death*/ for a new data
>> instance.
> 
> Are you sure you want to use life expectancy as the outcome? In order to
> establish a mathematical expectation you need to know the risk at all times
> in the future, which, as pointed out in the print.survfit help page, is undefined
> unless the last observation is a death. Very few datasets support such an
> estimate. If, on the other hand, you have sufficient events in the future, then
> you may be able to more readily justify an estimate of median survival.

Dear survivalUser;

I've been reminded that you later asked for a parametric model built with
survreg. The above commentary applies to the coxph models and objects and not to
survreg objects. If you do have a parametric model, even with incomplete
observation, then calculating life expectancy should be a simple matter of
plugging the estimated parameters into the formula for the distribution's
mean, since life expectancy is the statistical mean. So maybe you do want such
a model. The default survreg distribution is "weibull", so just go to your
mathematical statistics text and look up the formula for the mean of a Weibull
distribution with the estimated parameters.
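
A minimal sketch of that calculation, assuming the `lung` data and the covariates `age` and `ph.ecog` as stand-ins for the poster's data, and a hypothetical new individual. It relies on survreg's Weibull parameterization, under which survival time has shape 1/scale and scale exp(linear predictor), so the mean is exp(lp) * gamma(1 + scale). Note that predict(type = "response") returns exp(lp), which is the Weibull scale parameter rather than the mean:

```r
library(survival)

# Weibull accelerated failure time model on an assumed stand-in data set.
fit <- survreg(Surv(time, status) ~ age + ph.ecog, data = lung,
               dist = "weibull")

# Hypothetical new individual (values chosen for illustration only).
new <- data.frame(age = 60, ph.ecog = 1)

# Linear predictor and the "response" prediction, which is exp(lp).
lp   <- predict(fit, newdata = new, type = "lp")
resp <- predict(fit, newdata = new, type = "response")

# Mean of the fitted Weibull: scale exp(lp), shape 1 / fit$scale.
mean_time <- exp(lp) * gamma(1 + fit$scale)

# Median survival time from the fitted distribution.
median_time <- predict(fit, newdata = new, type = "quantile", p = 0.5)
```

The median is often the safer summary to report, because the mean leans on the fitted tail beyond the observed follow-up.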

-- 
David.
David Winsemius
Alameda, CA, USA
