Dear R-help,
I am using R version 3.4.0 on Windows, with survival 2.41-3.  I have fitted a
Prentice, Williams and Peterson counting process (PWP-CP) model to my data, as
shown below.  This is essentially an extension of the Cox model for recurrent
events, with the data in counting process (start, stop] format.  My dataset,
bdat5, can be found here:
https://drive.google.com/open?id=1sQSBEe1uBzh_gYbcj4P5Kuephvalc3gh
cfitcp2 <- coxph(Surv(start, stop, status) ~ sex + rels + factor(treat) +
                   log(age) + log(tcrate3 + 0.01) +
                   cluster(trialno) + strata(enum),
                 data = bdat5, model = TRUE, x = TRUE, y = TRUE)
I would now like to use the model to predict the probability of zero events by
two years, which I believe is equivalent to the survival probability at 2 years.
This is so that I can compare the output with similar estimates obtained from
negative binomial and zero-inflated negative binomial models fitted to the same
data (albeit in a different format).
To my mind, and based on what I've read, the best way to do this is to use
survfit.  I want to make predictions for each individual, therefore, I have
tried this code:
  
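trialnos <- unique(bdat5$trialno)

# For each trial number in `ids`, sum the survival estimates returned at `time`
prob0 <- function(ids, dataset, model, time) {
  probs <- numeric(length(ids))
  for (i in seq_along(ids)) {
    print(i)  # progress indicator
    sdata <- subset(dataset, trialno == ids[i])  # all rows for this trial number
    sfit <- survfit(model, newdata = sdata)
    probs[i] <- sum(summary(sfit, times = time)$surv)
  }
  return(probs)
}

prob0ests <- prob0(trialnos, bdat5, cfitcp2, 730)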
When I do this for the first three trial numbers I get:
0.3001021 2993.4531767    0.3445589
The unusually large "probability" (the second value) arises when there is only
1 row of data for the relevant trial number.  Can anyone explain why there is a
problem when "sdata" has only 1 row, and ideally suggest a solution?
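One way to narrow this down might be to look at how many survival values
summary() returns for a given newdata, since sum() adds up everything it is
given.  The following is only a sketch: surv_shape is a hypothetical helper
name, and the illustration uses the lung data shipped with survival rather than
bdat5, so it shows the kind of check rather than reproducing the problem.

```r
library(survival)

# Hypothetical helper: report the shape of what summary() returns
# when survival is requested at a single time point.
surv_shape <- function(model, newdata, time) {
  sfit <- survfit(model, newdata = newdata)
  s <- summary(sfit, times = time)
  list(n_rows_newdata = nrow(newdata),   # rows passed in as newdata
       n_surv_values  = length(s$surv),  # values that sum() would add up
       times_returned = unique(s$time))  # time points actually reported
}

# Illustration on a stratified Cox model fitted to the built-in lung data
# (an assumption for demonstration; this is not the PWP-CP model above).
fit <- coxph(Surv(time, status) ~ age + strata(sex), data = lung)
surv_shape(fit, lung[1, ], 365)    # newdata with a single row
surv_shape(fit, lung[1:3, ], 365)  # newdata with several rows
```

Calling surv_shape(cfitcp2, subset(bdat5, trialno == trialnos[2]), 730) on the
one-row case, and comparing n_surv_values and times_returned with a multi-row
case, should show whether more values than expected are being summed.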
Many thanks,
Laura
Dr Laura Bonnett
NIHR Post-Doctoral Fellow
Department of Biostatistics,
Waterhouse Building, Block F,
1-5 Brownlow Street,
University of Liverpool,
Liverpool,
L69 3GL
0151 795 9686
L.J.Bonnett at liverpool.ac.uk