The case-cohort function cch() is in the survival package. In cch(), the
Prentice method is implemented like this:
Prentice <- function(tenter, texit, cc, id, X, ntot, robust){
    eps <- 0.00000001
    cens <- as.numeric(cc>0)   # Censorship indicators
    subcoh <- as.numeric(cc<2) # Subcohort indicators
    ## Calculate Prentice estimate
    ent2 <- tenter
    ent2[cc==2] <- texit[cc==2]-eps
    fit1 <- coxph(Surv(ent2,texit,cens)~X,eps=eps,x=TRUE)
    ## Calculate Prentice estimate and variance
    nd <- sum(cens)    # Number of failures
    nc <- sum(subcoh)  # Number in subcohort
    ncd <- sum(cc==1)  # Number of failures in subcohort
    X <- as.matrix(X)
    aent <- c(tenter[cc>0],tenter[cc<2])
    aexit <- c(texit[cc>0],texit[cc<2])
    aX <- rbind(as.matrix(X[cc>0,]),as.matrix(X[cc<2,]))
    aid <- c(id[cc>0],id[cc<2])
    dum <- rep(-100,nd)
    dum <- c(dum,rep(0,nc))
    gp <- rep(1,nd)
    gp <- c(gp,rep(0,nc))
    fit <- coxph(Surv(aent,aexit,gp)~aX+offset(dum)+cluster(aid),
                 eps=eps,x=TRUE,iter.max=35,init=fit1$coefficients)
    db <- resid(fit,type="dfbeta")
    db <- as.matrix(db)
    db <- db[gp==0,]
    fit$phase2var <- (1-(nc/ntot))*t(db)%*%(db)
    fit$naive.var <- fit$naive.var+fit$phase2var
    fit$var <- fit$naive.var
    fit$coefficients <- fit$coef <- fit1$coefficients
    fit
}
The first call, fit1 <- coxph(), estimates the coefficients. The second
call, fit <- coxph(), appears to use the SelfPrentice method to estimate
the variance. My question is: why does the second coxph() estimate the
Prentice variance using the SelfPrentice method? Should the jackknife
variance of the Prentice estimator instead be implemented like this:

fit1 <- coxph(Surv(ent2,texit,cens)~X+cluster(id),eps=eps,x=TRUE)

My other question: a unique id does not seem to be required anywhere in
the program. Can we allow repeated (non-unique) ids without affecting the
final result?
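
To make the first question concrete, here is a minimal sketch that fits the same case-cohort sample with method="Prentice" and method="SelfPrentice" and prints the coefficients and standard errors side by side. It assumes the nwtco data set shipped with survival and follows the data preparation on the cch() help page. The object names fit.P and fit.SP and the choice of covariates are mine, for illustration only.

```r
library(survival)

# Build a case-cohort sample from nwtco: all relapses plus the subcohort
subcoh    <- nwtco$in.subcohort
selccoh   <- with(nwtco, rel == 1 | subcoh == 1)
ccoh.data <- nwtco[selccoh, ]
ccoh.data$subcohort <- subcoh[selccoh]
ccoh.data$age <- ccoh.data$age / 12   # age in years instead of months

# Same model and data, two estimation methods
fit.P  <- cch(Surv(edrel, rel) ~ stage + histol + age, data = ccoh.data,
              subcoh = ~subcohort, id = ~seqno,
              cohort.size = nrow(nwtco), method = "Prentice")
fit.SP <- cch(Surv(edrel, rel) ~ stage + histol + age, data = ccoh.data,
              subcoh = ~subcohort, id = ~seqno,
              cohort.size = nrow(nwtco), method = "SelfPrentice")

# Coefficients and standard errors from each method, side by side
print(cbind(coef.P  = fit.P$coefficients,
            coef.SP = fit.SP$coefficients,
            se.P    = sqrt(diag(fit.P$var)),
            se.SP   = sqrt(diag(fit.SP$var))))
```

If the two standard-error columns agree closely while the coefficient columns differ, that would be consistent with my reading that the Prentice method reuses the SelfPrentice variance calculation.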