Na'im R. Tyson
2010-Jan-22 08:53 UTC
[R] Computing Confidence Intervals for AUC in ROCR Package
Dear R-philes,

I am plotting ROC curves for several cross-validation runs of a classifier (using the function below). In addition to the average AUC, I am interested in obtaining a confidence interval for the average AUC. Is there a straightforward way to do this via the ROCR package?

    plot_roc_curve <- function(roc.dat, plt.title) {
        #print(str(vowel.ROC))
        pred <- prediction(roc.dat$predictions, roc.dat$labels)
        perf <- performance(pred, "tpr", "fpr")
        perf.auc <- performance(pred, "auc")
        perf.auc.areas <- slot(perf.auc, "y.values")
        curve.area <- mean(unlist(perf.auc.areas))
        #quartz(width=4, height=6)
        plot(perf, col="grey82", lty=3)
        plot(perf,lwd=3,avg="horizontal",spread.estimate="boxplot", add=T)
        title(main=plt.title)
        mtext(sprintf("%s%1.4f", "Area under Curve = ", curve.area),
              side=3, line=0, cex=0.8)
    }

P.S. After years of studying statistical analysis as a student, I still consider myself a novice.
David Winsemius
2010-Jan-22 13:21 UTC
[R] Computing Confidence Intervals for AUC in ROCR Package
On Jan 22, 2010, at 3:53 AM, Na'im R. Tyson wrote:

> I am plotting ROC curves for several cross-validation runs of a
> classifier (using the function below). In addition to the average
> AUC, I am interested in obtaining a confidence interval for the
> average AUC. Is there a straightforward way to do this via the ROCR
> package?

You should probably contact the authors. When I tried using that package a few weeks ago, several of the annotation features were broken. I contacted the author, who said there had been problems after converting to S4 methods. He also said there would be a fix, but not immediately. There has been a release since then, and I tried it, but it did not appear to fix the problems I encountered. All I was able to get were very simple ROC curves without any confidence intervals or marking of levels. I ended up turning to the Epi package for what I needed (but I did not need confidence intervals, so I cannot comment on that aspect).

--
David

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
Frank E Harrell Jr
2010-Jan-22 13:30 UTC
[R] Computing Confidence Intervals for AUC in ROCR Package
Even though ROC curves don't shed much light on the problem, the area under the ROC curve is useful because it is the Wilcoxon-type concordance probability. Denoting it by C, 2*(C-.5) is Somers' Dxy rank correlation between the predictions and the binary Y. You can get the standard error of Dxy from the rcorr.cens function in the Hmisc package, then back-solve for the s.e. of C and hence get a confidence interval for C. This uses U-statistics and is fairly assumption-free.

Frank

Na'im R. Tyson wrote:

> I am plotting ROC curves for several cross-validation runs of a
> classifier (using the function below). In addition to the average AUC,
> I am interested in obtaining a confidence interval for the average AUC.
> Is there a straightforward way to do this via the ROCR package?

--
Frank E Harrell Jr
Professor and Chairman
School of Medicine
Department of Biostatistics
Vanderbilt University
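
The back-solving step described in the reply above could be sketched roughly as follows. This is only an illustration, not code from the thread: the data are simulated stand-ins for roc.dat$predictions and roc.dat$labels, the interval uses a normal approximation (an assumption), and it covers C for a single set of predictions rather than the average AUC over cross-validation runs. Since Dxy = 2*(C - .5), the s.e. of C is half the s.e. of Dxy, which rcorr.cens reports in its "S.D." element.

```r
library(Hmisc)

# Simulated stand-in data (hypothetical; replace with your own
# predictions and 0/1 labels).
set.seed(1)
n <- 200
labels <- rbinom(n, 1, 0.5)
predictions <- labels + rnorm(n)

# With a plain vector as the second argument, rcorr.cens treats every
# observation as uncensored, so "C Index" is the area under the ROC curve.
rc <- rcorr.cens(predictions, labels)

C.index <- unname(rc["C Index"])
se.Dxy  <- unname(rc["S.D."])   # standard error of Somers' Dxy
se.C    <- se.Dxy / 2           # because Dxy = 2 * (C - 0.5)

# Approximate 95% interval for C (normal approximation assumed).
ci <- C.index + c(-1, 1) * qnorm(0.975) * se.C

print(round(c(C = C.index, lower = ci[1], upper = ci[2]), 4))
```

If the predictions from all cross-validation runs are pooled into one vector before this call, the result is an interval for the pooled C, which is not the same quantity as the mean of the per-run AUCs.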