I encountered this problem about 18 months ago. I contacted Prof. Fox
and Dr. Malewski (the R package maintainers for mice) but they referred
me to Prof. van Buuren. I wrote to Prof. van Buuren but am unable to
find his reply (if he did reply).
Here are the functions I used at that time, if you want to take it with
lots of salt. Let me know if you find anything fishy with it.
coxph.mids <- function (formula, data, ...) {
call <- match.call()
if (!is.mids(data)) stop("The data must have class mids")
analyses <- as.list(1:data$m)
for (i in 1:data$m) {
data.i <- complete(data, i)
analyses[[i]] <- coxph(formula, data = data.i, ...)
}
object <- list(call = call, call1 = data$call,
nmis = data$nmis, analyses = analyses)
oldClass(object) <- if (.SV4.) "mira" else c("mira",
"coxph")
return(object)
}
And in the function 'pool', the small sample adjustment requires
residual degrees of freedom (i.e. dfc). For a cox model, I believe that
this is simply the number of events minus the regression coefficients.
There is support for this from middle of page 149 of the book by Parmer
& Machin (ISBN 0471936405). Please correct me if I am wrong.
Here is the slightly modified version of pool :
pool <- function (object, method = "smallsample") {
call <- match.call()
if (!is.mira(object)) stop("The object must have class
'mira'")
if ((m <- length(object$analyses)) < 2)
stop("At least two imputations are needed for pooling.\n")
analyses <- object$analyses
k <- length(coef(analyses[[1]]))
names <- names(coef(analyses[[1]]))
qhat <- matrix(NA, nrow = m, ncol = k, dimnames = list(1:m, names))
u <- array(NA, dim = c(m, k, k),
dimnames = list(1:m, names, names))
for (i in 1:m) {
fit <- analyses[[i]]
qhat[i, ] <- coef(fit)
u[i, , ] <- vcov(fit)
}
qbar <- apply(qhat, 2, mean)
ubar <- apply(u, c(2, 3), mean)
e <- qhat - matrix(qbar, nrow = m, ncol = k, byrow = TRUE)
b <- (t(e) %*% e)/(m - 1)
t <- ubar + (1 + 1/m) * b
r <- (1 + 1/m) * diag(b/ubar)
f <- (1 + 1/m) * diag(b/t)
df <- (m - 1) * (1 + 1/r)2
if (method == "smallsample") {
if( any( class(fit) == "coxph" ) ){
### this loop is the hack for survival analysis ###
status <- fit$y[ , 2]
n.events <- sum(status == max(status))
p <- length( coefficients( fit ) )
dfc <- n.events - p
} else {
dfc <- fit$df.residual
}
df <- dfc/((1 - (f/(m + 1)))/(1 - f) + dfc/df)
}
names(r) <- names(df) <- names(f) <- names
fit <- list(call = call, call1 = object$call, call2 = object$call1,
nmis = object$nmis, m = m, qhat = qhat, u = u,
qbar = qbar, ubar = ubar, b = b, t = t, r = r, df = df,
f = f)
oldClass(fit) <- if (.SV4.) "mipo" else c("mipo",
oldClass(object))
return(fit)
}
print.miro only gives the coefficients. Often I need the standard errors
as well since I want to test if each regression coefficient from
multiple imputation is zero or not. Since the function summary.mipo does
not exist, can I suggest the following
summary.mipo <- function(object){
if (!is.null(object$call1)){
cat("Call: ")
dput(object$call1)
}
est <- object$qbar
se <- sqrt(diag(object$t))
tval <- est/se
df <- object$df
pval <- 2 * pt(abs(tval), df, lower.tail = FALSE)
coefmat <- cbind(est, se, tval, pval)
colnames(coefmat) <- c("Estimate", "Std. Error",
"t value",
"Pr(>|t|)")
cat("\nCoefficients:\n")
printCoefmat( coefmat, P.values=T, has.Pvalue=T, signif.legend=T )
cat("\nFraction of information about the coefficients
missing due to nonresponse:",
"\n")
print(object$f)
ans <- list( coefficients=coefmat, df=df,
call=object$call1, fracinfo.miss=object$f )
invisible( ans )
}
Hope this helps.
Regards, Adai
Inman, Brant A. M.D. wrote:> R-helpers:
>
> I have a dataset that has 168 subjects and 12 variables. Some of the
> variables have missing data and I want to use the multiple imputation
> capabilities of the "mice" package to address the missing data.
Given
> that mice only supports linear models and generalized linear models (via
> the lm.mids and glm.mids functions) and that I need to fit Cox models, I
> followed the previous suggestion of John Fox and have created my own
> function "cox.mids" to use coxph to fit models to the imputed
datasets.
>
> (http://tolstoy.newcastle.edu.au/R/help/06/03/22295.html)
>
> The function I created is:
>
> ------------
>
> cox.mids <- function (formula, data, ...)
> {
> call <- match.call()
> if (!is.mids(data))
> stop("The data must have class mids")
> analyses <- as.list(1:data$m)
> for (i in 1:data$m) {
> data.i <- complete(data, i)
> analyses[[i]] <- coxph(formula, data = data.i, ...)
> }
> object <- list(call = call, call1 = data$call, nmis = data$nmis,
> analyses = analyses)
> oldClass(object) <- c("mira", "coxph")
> return(object)
> }
>
> ------------
>
> The problem that I encounter occurs when I try to use the "pool"
> function to pool the fitted models into one general model. Here is some
> code that reproduces the error using the pbc dataset.
>
> ------------
>
> d <-
pbc[,c('time','status','age','sex','hepmeg','platelet',
'trt',
> 'trig')]
> d[d==-9] <- NA
> d[,c(4,5,7)] <- lapply(d[,c(4,5,7)], FUN=as.factor)
> str(d)
>
> imp <- mice(d, m=10, maxit=10, diagnostics=T, seed=500,
> defaultImputationMethod=c('norm', 'logreg',
'polyreg'))
>
> fit <- cox.mids(Surv(time,status) ~ age + sex + hepmeg + platelet + trt
> + trig, imp)
>
> pool(fit)
>
> ------------
>
> I have looked at the "pool" function but cannot figure out what I
have
> done wrong. Would really appreciate any help with this.
>
> Thanks,
>
> Brant Inman
> Mayo Clinic
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
>