thr3ads.net - R help - [R] Modelling survival with time-dependent covariates [Jul 2010]


Ben Rhelp

2010-Jul-01 19:28 UTC

[R] Modelling survival with time-dependent covariates

Hi all,

I am looking at the tutorial/appendix from John Fox on "Cox Proportional-Hazards
Regression for Survival Data", available here:
http://cran.r-project.org/doc/contrib/Fox-Companion/appendix-cox-regression.pdf
I am particularly interested in modelling survival with time-dependent
covariates (Section 4).
 
The data look like this:
> Rossi.2[1:50,]
     start stop arrest.time week arrest fin age race wexp mar paro prio educ employed
         0    1           0   20      1   0  27    1    0   0    1    3    3        0
         1    2           0   20      1   0  27    1    0   0    1    3    3        0
...
        18   19           0   20      1   0  27    1    0   0    1    3    3        0
        19   20           1   20      1   0  27    1    0   0    1    3    3        0
         0    1           0   17      1   0  18    1    0   0    1    8    4        0
         1    2           0   17      1   0  18    1    0   0    1    8    4        0
...
        15   16           0   17      1   0  18    1    0   0    1    8    4        0
        16   17           1   17      1   0  18    1    0   0    1    8    4        0
         0    1           0   25      1   0  19    0    1   0    1   13    3        0
         1    2           0   25      1   0  19    0    1   0    1   13    3        0
...
3.13    12   13           0   25      1   0  19    0    1   0    1   13    3        0
 
John suggests the following model:
mod.allison.2 <- coxph(Surv(start, stop, arrest.time) ~
+ fin + age + race + wexp + mar + paro + prio + employed,
+ data=Rossi.2)
 1- Would telling coxph which rows represent the same person (through the use
of an id, for example) improve the "efficiency" of the estimated model? And if
so, how should I do that? Using strata()?
 
2- He later suggests "accommodating non-proportional hazards by building
interactions between covariates and time into the Cox regression model" as
follows:
 
mod.allison.5 <- coxph(Surv(start, stop, arrest.time) ~
+           fin + age + age:stop + prio,
+           data=Rossi.2)
 
I have read quite a lot of documentation to understand the meaning of "age +
age:stop" in the formula, but I am unsure of what it means. If I wanted to 
visualise these variables which are entering the model, would it be something
like:
data.frame(Rossi.2$age,Rossi.2$age %in% Rossi.2$stop)
 
I hope this makes sense. Thanks for your help,
Ben

Terry Therneau

2010-Jul-02 13:25 UTC


[R] Modelling survival with time-dependent covariates

1- Would telling coxph which rows represent the same person (through the
use of an id, for example) improve the "efficiency" of the estimated
model? And if so, how should I do that? Using strata()?

 No, it makes no difference. The reason is that the (start, stop] notation
is just a trick.  At each death time the program needs to figure out what
the covariates are for everyone else at that time; the start,stop lets it
pick the right line for each subject.  As long as there are no overlaps,
such as (0,20] and (15,50], there is only one copy of the person, and no
'correlated data' issue.  (Overlap is weird -- it corresponds to two
copies of me being in the room at the same time.)
 If there are multiple events for a subject, then there is correlation
(via a different mechanism), and addition of a cluster() term is needed.
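
The no-overlap condition can be checked directly. The following is only a sketch in base R on a made-up data frame: the `id` column and the helper `has_overlap` are hypothetical names for illustration, not part of Rossi.2 or the survival package.

```r
# Toy counting-process data; 'id' is a hypothetical subject identifier.
# Subject 1 has contiguous intervals, subject 2 has the overlapping
# pair (0,20] and (15,50] mentioned above.
toy <- data.frame(
  id    = c(1, 1, 1, 2, 2),
  start = c(0, 1, 2, 0, 15),
  stop  = c(1, 2, 3, 20, 50)
)

# TRUE if, after sorting by start, any interval begins before the
# previous interval for the same subject has ended.
has_overlap <- function(start, stop) {
  o <- order(start)
  s <- start[o]
  e <- stop[o]
  n <- length(s)
  n > 1 && any(s[-1] < e[-n])
}

# One logical value per subject.
overlap_by_id <- sapply(split(toy, toy$id),
                        function(d) has_overlap(d$start, d$stop))
print(overlap_by_id)
```

For the multiple-events case, the cluster term goes into the model formula, along the lines of `coxph(Surv(start, stop, event) ~ x + cluster(id), data = d)`, where `event`, `x`, `id` and `d` are again placeholder names.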

2- He later suggests "accommodating non-proportional hazards by building
interactions between covariates and time into the Cox regression model"
as follows:
 
 coxph(Surv(start, stop, arrest.time) ~fin + age + age:stop + prio, ...
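
As for what "age + age:stop" means: in an R model formula, `a:b` for two numeric variables is their row-by-row product. A small sketch with three made-up rows (not taken from Rossi.2) shows the columns that such a formula generates; `%in%` is not involved.

```r
# Three made-up rows for one person aged 27, split into weekly intervals.
d <- data.frame(start = 0:2, stop = 1:3, age = 27)

# model.matrix() expands the formula into the numeric columns a model
# would see. It also adds an intercept column, which a Cox model drops.
X <- model.matrix(~ age + age:stop, data = d)
print(X)

# The interaction column is simply age multiplied by stop.
print(data.frame(age = d$age, stop = d$stop, age_x_stop = d$age * d$stop))
```

The model therefore contains a term beta1 * age + beta2 * age * stop, so the log-hazard effect of one year of age is beta1 + beta2 * stop, which drifts linearly with the end time of each interval.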

This trick ONLY works if 
  a. the data set has been artificially divided (as your example has)
into small uniform time increments, the same for each subject.
  b. the form of the non-PH is actually a linear change in beta over
time.  Use cox.zph on the original model to look at this.  When I see
non-PH (the plot from cox.zph is not horizontal) life is rarely so
simple.
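
A minimal sketch of the cox.zph check suggested in (b). It assumes the survival package is installed and uses its bundled `lung` data purely as a stand-in, since Rossi is not shipped with survival; the covariates chosen here are arbitrary.

```r
library(survival)

# 'lung' ships with the survival package and stands in for the
# original (one row per subject) data set here.
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# Score test of proportional hazards per covariate, plus a global test.
# transform = "identity" puts untransformed time on the x-axis, which is
# the scale on which a covariate-by-time interaction is linear.
zp <- cox.zph(fit, transform = "identity")
print(zp)

# One panel per covariate, showing a smoothed estimate of beta(t).
# A roughly horizontal curve is consistent with proportional hazards;
# a sloped straight line is the case the age:stop trick can handle.
plot(zp)
```

If the curve bends rather than sloping steadily, a single covariate-by-time product will not describe the departure from proportional hazards.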

Terry Therneau
