Dylan Beaudette
2007-May-14 23:38 UTC
[R] cross-validation / sensitivity analysis for logistic regression model
Hi,

I have developed a logistic regression model of the form (factor_1 ~ numeric + factor_2) and would like to perform a cross-validation, or some similar form of sensitivity analysis, on this model.

Using cv.glm() from the boot package:

# dataframe from which model was built in 'z'
# model is called 'm_geo.lrm'
# as suggested in the man page for a binomial model:
cost <- function(r, pi=0) mean(abs(r-pi)>0.5)
cv.10.err <- cv.glm(z, m_geo.lrm, cost, K=10)$delta

I get the following:

cv.10.err
    1     1
0.275 0.281

Am I correct in interpreting this as the mean estimated error percentage for the specified model, after 10 runs of the cross-validation? Any tips on understanding the output from cv.glm() would be greatly appreciated.

I am mostly looking to perform a sensitivity analysis with this model and dataset. Perhaps there are other methods?

Thanks,

--
Dylan Beaudette
Soils and Biogeochemistry Graduate Group
University of California at Davis
530.754.7341
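For readers trying to interpret the two numbers, here is a minimal sketch of the same call on simulated data. The data frame `d`, its columns `x`, `g`, `y`, and the coefficients used to simulate them are assumptions made up for illustration; they are not Dylan's data or model. Per the cv.glm() help page, `delta` has two components: the first is the raw cross-validation estimate of prediction error, and the second is an adjusted estimate meant to compensate for the bias of using K folds instead of leave-one-out. With the cost function above, both are estimated misclassification proportions. They come from a single random split into K = 10 folds, with each observation predicted once by a model fitted without its fold, not from 10 repeated runs.

```r
library(boot)  # recommended package, ships with R

set.seed(1)    # cv.glm() assigns folds at random, so fix the seed

# Hypothetical stand-in data: binary response, one numeric and one factor predictor
n <- 200
d <- data.frame(x = rnorm(n),
                g = factor(sample(c("a", "b", "c"), n, replace = TRUE)))
d$y <- rbinom(n, 1, plogis(-0.5 + 1.2 * d$x + 0.8 * (d$g == "b")))

fit <- glm(y ~ x + g, family = binomial, data = d)

# Cost function from the cv.glm() help page: proportion of observations
# whose predicted probability is on the wrong side of 0.5
cost <- function(r, pi = 0) mean(abs(r - pi) > 0.5)

cv <- cv.glm(d, fit, cost, K = 10)

# delta[1]: raw 10-fold CV estimate of the misclassification rate
# delta[2]: bias-adjusted version of the same estimate
cv$delta
```

Because the fold assignment is random, rerunning without the seed gives slightly different values; repeating the call several times and looking at the spread is one rough way to see how stable the estimate is.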
Cody_Hamilton at Edwards.com
2007-May-14 23:49 UTC
[R] cross-validation / sensitivity analysis for logistic regression model
Dylan,

You might like the validate() function in the Design library. It validates several model indices (e.g. R^2) using resampling. There is some discussion of this function (as well as of validating your model via resampling) in the book on S programming by Carlos Alzola and Frank Harrell (available at http://biostat.mc.vanderbilt.edu/twiki/pub/Main/RS/sintro.pdf).

Regards,
-Cody
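A rough sketch of what Cody's suggestion might look like in practice. It uses the rms package, the successor to Design by the same author, and assumes rms is installed; the simulated data frame `d` and its columns are hypothetical and only stand in for a binary response with one numeric and one factor predictor. The model has to be fitted with lrm() and `x = TRUE, y = TRUE` so that validate() can refit it on resamples.

```r
# Assumes the rms package is available; the block is skipped otherwise
if (requireNamespace("rms", quietly = TRUE)) {
  library(rms)

  set.seed(2)  # bootstrap resamples are random

  # Hypothetical stand-in data, not the data from the original post
  n <- 200
  d <- data.frame(x = rnorm(n),
                  g = factor(sample(c("a", "b", "c"), n, replace = TRUE)))
  d$y <- rbinom(n, 1, plogis(-0.5 + 1.2 * d$x + 0.8 * (d$g == "b")))

  # x = TRUE, y = TRUE store the design matrix and response in the fit
  fit <- lrm(y ~ x + g, data = d, x = TRUE, y = TRUE)

  # Bootstrap validation: for each index (Dxy, R2, ...) the result reports
  # the apparent value, the estimated optimism, and the corrected value
  v <- validate(fit, method = "boot", B = 100)
  print(v)
}
```

The `index.corrected` column is the one to read as an estimate of how the model would perform on new data; a large gap between `index.orig` and `index.corrected` suggests overfitting.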