Terry Therneau
2011-Dec-12 19:30 UTC
[R] k-folds cross validation with conditional logistic
--begin inclusion -- I have a matched-case control dataset that I'm using conditional logistic regression (clogit in survival) to analyze. I'm trying to conduct k-folds cross validation on my top models but all of the packages I can find (CVbinary in DAAG, KVX) won't work with clogit models. Is there any easy way to do this in R? -end inclusion -- The clogit funciton is simply a wrapper for coxph. clogit(case ~ ... turns into coxph(Surv(dummy, case) ~ ... where "dummy" is a vector of ones. Do the packages support coxph models? Terry T
cberry at tajo.ucsd.edu
2011-Dec-13 16:58 UTC
[R] k-folds cross validation with conditional logistic
Terry Therneau <therneau at mayo.edu> writes:> --begin inclusion -- > I have a matched-case control dataset that I'm using conditional > logistic regression (clogit in survival) to analyze. I'm trying to > conduct k-folds cross validation on my top models but all of the > packages I can find (CVbinary in DAAG, KVX) won't work with clogit > models. Is there any easy way to do this in R? > -end inclusion -- > > The clogit funciton is simply a wrapper for coxph. > clogit(case ~ ... > turns into > coxph(Surv(dummy, case) ~ ... > where "dummy" is a vector of ones. > > Do the packages support coxph models?Terry, I do not know the answer to the question you posed, but I suspect the answer is no. The cross-validation would need to be done stratum-wise, but that does not seem to be supported by predict.coxph():> fit <- clogit(case~spontaneous+induced+strata(stratum),data=infert) > train.sans.1 <- update(fit,subset=stratum!=1) > predict.1 <- predict(train.sans.1,newdat=subset(infert,stratum==1))Error in predict.coxph(train.sans.1, newdat = subset(infert, stratum == : New data has a strata not found in the original model One can work around this:> train.sans.1.alt <- update(fit,subset= stratum!=1 | case == 1 ) > all(coef(train.sans.1),coef(train.sans.1.alt))[1] TRUE> predict.1.alt <- predict(train.sans.1.alt,newdat=subset(infert,stratum==1))but the predicted values are not centered in each stratum as usual with strata in predict.coxph (if that matters):> predict.1.alt1 84 166 0.000000 -2.527759 -2.527759> predict.1.alt - mean(predict.1.alt)1 84 166 1.6851724 -0.8425862 -0.8425862 Best, Chuck> > Terry T >-- Charles C. Berry Dept of Family/Preventive Medicine ccberry at ucsd dot edu UC San Diego http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901