Mayeul KAUFFMANN
2004-Jul-29 10:38 UTC
[R] proximity covariate to model dependence in cox model (question from Vito Muggeo)
(I received a messsage from Vito Muggeo [who sometimes posts here] about my previous posts here. Thus, it might be of interest here)> dear Mayeul, > I read your post on R -list about survival analysis with multiple (twoif I> am not right) events per subject.Sometimes subjetcs have even 3 events (i.e. civil wars for last 40 years)> 1)How did you calculate the pxcw variable?(Note: pxcm stands for ProXimity.of.lastCivil.War) the answer is at https://www.stat.math.ethz.ch/pipermail/r-help/2004-July/053556.html after the words "I used fit$loglik to chose a.chosen.parameter from 8 values, for 3 types of events:" I used (exp(-days.since.event.of.type.one/a1)), which gives an indicator between 1 and 0 that falls quickly first then more slowly (see the end of the above post, 053556.html) Simply only use event.of.type.one. I am not doing a competing risk analysis: I am only counting one type of event (type.one), but the covariates measure other events that may increase the risk. The difficulty is to choose the good a1 parameter. You will typically need values much lower than mine with your exapmle dataset. It then become: la<-c(263.5, 526.9,1053.9,2107.8,4215.6,8431.1) #list of values to choose. Adapt it from z<-NULL;for(a1 in la) {coxtmp <- (coxph(Surv(start,stop,status)~ +I(exp(-days.since.event.of.type.one/a1)) + other.time.dependent.covariates +cluster(id) ,data=x,robust=T)) rbind(z,c(a1,coxtmp$wald.test, coxtmp$rscore, coxtmp$loglik, coxtmp$score))->z } z <- data.frame(z) names(z) <- c("a1","wald.test", "rscore", "NULLloglik","loglik", "score") z[which.max(z$rscore),] z[which.max(z$loglik),]> Namely, for instance,given the following data set (standard in multiple > event formulation): > > data.frame(id=c(3,3,4,4,5),start=c(0,10,0,7,0),stop=c(10,15,7,20,9), > + event=c(1,1,1,0,0),stratum=c(1,2,1,2,1)) > id start stop event stratum > 1 3 0 10 1 1 > 2 3 10 15 1 2 > 3 4 0 7 1 1 > 4 4 7 20 0 2 > 5 5 0 9 0 1 > how the pxcw variable is computed for each *row* in the dataset? > should be something like "ifelse(stratum==1,0,(stop-start))", i.e. > (0,5,0,13,0) , or am I wrong?It looks computationnaly OK, but strange to me.You can choose the function that matches best the dependence, maybe it is. But, assuming risk jumps just after an event then decreases slowly , I would rather use a very high value when there is no previous event, for instance: ifelse(stratum==1,100,(stop-start)) Otherwise, a very close event (yesterday) would nearly be coded like no previous event! or even better, truncate the decrease : ifelse(stratum==1,100,ifelse(stop-start>100, 100,stop-start )) Here, after 100 days, it is like having no previous event. But you should divide your observations to account for the change in the proximity variable. Instead of:> id start stop event stratum proxim > 4 0 7 1 1 100 > 4 7 20 0 2 13use> id start stop event stratum proxim > 4 0 7 1 1 100 > 4 7 8 0 2 1 > 4 8 9 0 2 2. .> 4 19 20 0 2 13With the function I used, I used a covariate (days.since.event) which was coded 9999999 if no previous event, I need: exp((-days.since.event)/a1)> Is this a your idea or did you find it elsewhere? could you give me any > reference?The proximity covariate is from http://www.worldbank.org/research/conflict/papers/CivilPeace.pdf (But they do not use the counting process we use here. They only measure covariates for all countries when one country experiments an event. Thus, all the remaining is mine (I think))> > Hope you can help me, > regards, > vito >