Laurent Fanchon
2006-Mar-15 16:57 UTC
[R] How to compare areas under ROC curves calculated with ROCR package
Dear all,

I am trying to compare the performance of several parameters for diagnosing lameness in dogs.
I have several ROC curves from the same dataset.
I plotted the ROC curves and calculated the AUC with the ROCR package.

I would like to compare the AUCs.
I used the following program, which I found in the R-help archives:

From: Bernardo Rangel Tura
Date: Thu 16 Dec 2004 - 07:30:37 EST

seROC<-function(AUC,na,nn){
  a<-AUC
  q1<-a/(2-a)
  q2<-(2*a^2)/(1+a)
  se<-sqrt((a*(1-a)+(na-1)*(q1-a^2)+(nn-1)*(q2-a^2))/(nn*na))
  se
}

cROC<-function(AUC1,na1,nn1,AUC2,na2,nn2,r){
  se1<-seROC(AUC1,na1,nn1)
  se2<-seROC(AUC2,na2,nn2)

  sed<-sqrt(se1^2+se2^2-2*r*se1*se2)
  zad<-(AUC1-AUC2)/sed
  p<-dnorm(zad)
  a<-list(zad,p)
  a
}

The author of this script says: "The first function (seROC) calculate the standard error of ROC curve, the second function (cROC) compare ROC curves."

What do you think of this script?
Is there a function in ROCR that does this better?

Any help would be greatly appreciated.

Laurent Fanchon
DVM, MS
Ecole Nationale Vétérinaire d'Alfort
FRANCE
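One detail in the quoted cROC is worth checking: dnorm(zad) returns the height of the standard normal density at zad, not a tail probability. If a two-sided p-value for the z statistic is what is wanted, the usual expression is 2*pnorm(-abs(z)). A minimal sketch follows; z_to_p is a hypothetical helper name used only for illustration and is not part of the original script or of ROCR.

```r
# Hypothetical helper (not in the original script): two-sided p-value
# for a z statistic, assuming it is standard normal under the null.
z_to_p <- function(z) 2 * pnorm(-abs(z))

z <- 3
density_at_z <- dnorm(z)   # density at z (about 0.0044), not a p-value
p_two_sided  <- z_to_p(z)  # tail probability (about 0.0027)

print(c(density = density_at_z, p.value = p_two_sided))
```

The two numbers can happen to be close for some values of z, which makes the substitution easy to miss, but only the pnorm form is a probability.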
Frank Samuelson
2006-Mar-23 14:54 UTC
[R] How to compare areas under ROC curves calculated with ROCR package
The seROC routine you included is a very good approximation to the standard error of the Mann-Whitney-Wilcoxon/Area under the ROC curve statistic. It is derived from negative exponential models, but works very well in general (e.g. Hanley and McNeil, Radiology, 1982, v. 143, p. 29). A more general estimator of the variance is given by Campbell, Douglas and Bailey (Proc. Computers in Cardiology, 1988, p. 267). I've implemented that in the R code included below. It is not an unbiased estimator, but it is very close.

The cROC function is probably not what you want, however. It assumes that the data from the two different area measures are independent. You said your measures are "from the same dataset." Your different AUC measures will be highly correlated.

A number of methods exist for dealing with correlated ROC curves. If you are interested in performing hypothesis testing on the difference in AUC of two parameters, I would suggest a permutation test. Permuting the ranks of the data between parameters is simple and works well.

-Frank

##################################################################
AuROC<-function(neg,pos) {
  # Empirical area under ROC / Wilcoxon-Mann-.... stat.
  # Also calculate the empirical variance thereof. Goes as O(n*log(n)).
  nx<-length(neg);
  ny<-length(pos);
  nall<-nx+ny;
  rankall<-rank(c(neg,pos))        # rank of all samples with respect to one another.
  rankpos<-rankall[(nx+1):nall];   # ranks of the positive cases
  ranksum<-sum(rankpos)-ny*(ny+1)/2  # sum of ranks of positives among negs.
  ranky<-rank(pos);                ## ranks of just the y's (positives) among themselves
  rankyx<-rankpos-ranky            # ranks of the y's among the x's (negatives)
  p21<-sum(rankyx*rankyx-rankyx)/nx/(nx-1)/ny;   # term in variance
  rankx<-rank(neg);                ## ranks of x's (negatives) among each other
  ## reverse ranks of x's with respect to y's.
  rankxy<- ny- rankall[1:nx]+ rankx ;
  p12<- sum(rankxy*rankxy-rankxy)/nx/ny/(ny-1);  # another variance term
  a<-ranksum/ny/nx;                # the empirical area
  v<-(a*(1-a)+(ny-1)*(p12-a*a) + (nx-1)*(p21-a*a))/nx/ny;
  c(a,v);   # return vector containing Mann-Whitney stat and the variance.
}
####################################################
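The permutation test suggested in the reply is not spelled out in the thread. The following is one possible reading of "permuting the ranks of the data between parameters", offered as a sketch rather than as the method the reply had in mind. It assumes that both parameters are measured on the same subjects, that higher values indicate disease, and that label is coded 1 for positive and 0 for negative. The names auc, perm_test_auc, x1, x2, label and n_perm are all hypothetical, and the data are simulated.

```r
# Empirical AUC (Mann-Whitney statistic); ties are handled by mid-ranks.
auc <- function(score, label) {
  pos <- score[label == 1]
  neg <- score[label == 0]
  r <- rank(c(neg, pos))
  n_pos <- length(pos)
  n_neg <- length(neg)
  (sum(r[(n_neg + 1):(n_neg + n_pos)]) - n_pos * (n_pos + 1) / 2) /
    (n_pos * n_neg)
}

# Paired permutation test for AUC(x1) - AUC(x2) on the same subjects.
# Assumption: under the null, the two parameters are exchangeable within
# each subject once both are converted to ranks.
perm_test_auc <- function(x1, x2, label, n_perm = 2000) {
  r1 <- rank(x1)  # rank-transform so both parameters share a scale
  r2 <- rank(x2)
  observed <- auc(r1, label) - auc(r2, label)
  perm <- replicate(n_perm, {
    swap <- runif(length(label)) < 0.5  # swap the pair for about half the subjects
    p1 <- ifelse(swap, r2, r1)
    p2 <- ifelse(swap, r1, r2)
    auc(p1, label) - auc(p2, label)
  })
  # Two-sided p-value; the +1 terms keep the estimate away from zero.
  p_value <- (1 + sum(abs(perm) >= abs(observed))) / (n_perm + 1)
  list(difference = observed, p.value = p_value)
}

# Simulated example: two correlated parameters on 60 subjects.
set.seed(1)
n <- 60
label <- rep(c(0, 1), each = n / 2)
x1 <- rnorm(n, mean = label)
x2 <- x1 + rnorm(n, sd = 0.5)
res <- perm_test_auc(x1, x2, label, n_perm = 500)
print(res)
```

Swapping within subjects keeps the pairing between the two parameters intact, so the correlation between the AUC estimates is reflected in the permutation distribution instead of having to be supplied as a separate r.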