Christine SINOQUET
2011-Mar-13 15:31 UTC
[R] use of ROCR package (ROC curve / AUC value) in a specific case versus integral calculation
Hello,

I would like to use the ROCR package to draw ROC curves and compute AUC values.

However, in the specific context of my application, the true positive rates and false positive rates are already provided by some upstream method.

Of course, I can draw a ROC plot with the following command:

  plot(x=FPrate, y=TPrate, type="o", xlab="false positive rate", ylab="true positive rate", xlim=c(0, 1), ylim=c(0, 1))

but this will not compute the AUC value.

There are two possibilities:

Either it is possible to use the above parameters (the FPrate and TPrate vectors) to run the performance function, and I would like to know how;

or it is not possible and I have to compute the area under the curve myself. In that case, I cannot find on the Web how to do this through an R package using the two vectors above (I would rather not implement an integration algorithm).

Thank you in advance for your answer.

Best regards,
Christine Sinoquet
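For the second possibility raised in this message, a minimal sketch in base R is given here: it applies the trapezoidal rule directly to the two vectors. The helper name trapz_auc and the example values are hypothetical, and the sketch assumes the curve should be anchored at (0, 0) and (1, 1). It is one possible approach, not a ROCR feature.

```r
# Trapezoidal area under a ROC curve given as (FPrate, TPrate) points.
# trapz_auc is a hypothetical helper name; it is not part of ROCR.
trapz_auc <- function(FPrate, TPrate) {
  stopifnot(length(FPrate) == length(TPrate))
  # Assumption: the curve starts at (0, 0) and ends at (1, 1)
  fpr <- c(0, FPrate, 1)
  tpr <- c(0, TPrate, 1)
  # Order the points by false positive rate (ties broken by TPR)
  ord <- order(fpr, tpr)
  fpr <- fpr[ord]
  tpr <- tpr[ord]
  n <- length(fpr)
  # Sum the trapezoid areas between consecutive points
  sum(diff(fpr) * (tpr[-1] + tpr[-n]) / 2)
}

# Made-up example values, for illustration only
FPrate <- c(0.0, 0.1, 0.3, 0.6, 1.0)
TPrate <- c(0.0, 0.5, 0.8, 0.9, 1.0)
auc <- trapz_auc(FPrate, TPrate)
print(auc)
```

With only a few (FPrate, TPrate) points the result is an approximation whose quality depends on how densely the curve is sampled.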
Jose-Marcio Martins da Cruz
2011-Mar-13 20:48 UTC
[R] use of ROCR package (ROC curve / AUC value) in a specific case versus integral calculation
Christine SINOQUET wrote:
> Hello,
>
> I would like to use the ROCR package to draw ROC curves and compute AUC
> values.
>
> However, in the specific context of my application, the true positive
> rates and false positive rates are already provided by some upstream
> method.
>
> Of course, I can draw a ROC plot with the following command :
>
> plot(x=FPrate, y=TPrate, "o", xlab="false positive rate", ylab="true
> positive rate", xlim=c(0, 1), ylim=c(0, 1))
>
> but this will not compute the AUC value.
>
> There are two possibilities :
> Either it is possible to use the above parameters - FPrate and TPrate
> vectors- to run the performance function and I would like to know how,
>
> or it is not possible and I have to compute the area under the curve but
> I cannot find on the Web how to perform this, through an R package,
> using the two vectors above, if possible (I would rather not implement
> an integration algorithm).
>
> I thank you in advance for your answer.

My answer doesn't use R...

For some applications, when the classifier performance is quite good, integrating the ROC curve isn't a good idea, as the curve is very steep.

Another approach is to use the fact that (1 - AUC) is the probability that, when you take one positive and one negative event at random, their scores put them in the wrong order. This is related to the Wilcoxon rank-sum (Mann-Whitney) statistic.

So, if you have a set of negative and positive events and the "scores" assigned to them, you can sort them by score and count the number of pairs that are out of order.

Hope this helps.

José-Marcio

--
---------------------------------------------------------------
Jose Marcio MARTINS DA CRUZ           http://j-chkmail.ensmp.fr
Ecole des Mines de Paris
60, bd Saint Michel                75272 - PARIS CEDEX 06
mailto:Jose-Marcio.Martins at mines-paristech.fr
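The pair-counting idea in this reply can be sketched in base R as follows. This is an illustration under stated assumptions, not the poster's own code: it needs the raw scores and 0/1 labels (which the original question may not have, since only the rates were available), the helper names pairwise_auc and rank_auc are hypothetical, ties are counted as half a correct pair, and the example data are made up.

```r
# AUC as the fraction of (positive, negative) pairs ranked in the
# correct order, with tied scores counted as half a correct pair.
# pairwise_auc and rank_auc are hypothetical helper names.
pairwise_auc <- function(scores, labels) {
  pos <- scores[labels == 1]
  neg <- scores[labels == 0]
  # outer() compares every positive score with every negative score
  correct <- outer(pos, neg, ">")
  tied    <- outer(pos, neg, "==")
  (sum(correct) + 0.5 * sum(tied)) / (length(pos) * length(neg))
}

# Same quantity from ranks (Mann-Whitney U divided by the pair count);
# this avoids building the full pos-by-neg comparison matrix.
rank_auc <- function(scores, labels) {
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  r <- rank(scores)  # tied scores receive their average rank
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Made-up example: higher score should mean "more likely positive"
scores <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2)
labels <- c(1,   1,   0,   1,   0,    1,   0,   0)

auc_pairs <- pairwise_auc(scores, labels)
auc_ranks <- rank_auc(scores, labels)
print(c(auc_pairs, auc_ranks))
```

Here 13 of the 16 positive/negative pairs are in the correct order, so both functions return 13/16, and 1 - AUC = 3/16 is the share of out-of-order pairs. The rank-based form is usually preferable for large data sets because it only requires a sort.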