Hi all, I would like to calculate the area under the ROC curve for my predictive model. I have managed to plot points giving me the ROC curve. However, I do not know how to get the value of the area under. Does anybody know of a function that would give the result I want using an array of specificity and an array of sensitivity as input? Thanks, Olivier -- View this message in context: http://www.nabble.com/How-to-calculate-the-area-under-the-curve-tp26010501p26010501.html Sent from the R help mailing list archive at Nabble.com.
> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of olivier.abz
> Sent: Thursday, October 22, 2009 7:23 AM
> To: r-help at r-project.org
> Subject: [R] How to calculate the area under the curve
>
> I would like to calculate the area under the ROC curve for my
> predictive model. I have managed to plot points giving me the ROC curve.

If x and y are the coordinates of the vertices of the polygon (e.g., polygon(x,y) would draw the polygon), then the polygon's signed area is given by the following:

  area <- function(x, y) sum(x * c(y[-1], y[1]) - c(x[-1], x[1]) * y) / 2

(Positive if the boundary is traced counter-clockwise.) This is a discrete version of Green's theorem.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
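To apply the polygon formula above to an ROC curve, the curve first has to be closed into a polygon. The following is only a sketch under stated assumptions: the ROC points are hypothetical, they run from (0, 0) to (1, 1), and the polygon is closed by adding the corner (1, 0). Traced in that order the boundary runs clockwise, so the signed area comes out negative and abs() is taken.

```r
# Signed polygon area (shoelace formula), as given in the reply above
area <- function(x, y) sum(x * c(y[-1], y[1]) - c(x[-1], x[1]) * y) / 2

# Hypothetical ROC points: fpr = 1 - specificity, tpr = sensitivity,
# ordered from (0, 0) to (1, 1)
fpr <- c(0, 0.1, 0.4, 1)
tpr <- c(0, 0.6, 0.9, 1)

# Close the polygon with the corner (1, 0); the region enclosed is
# the area between the ROC curve and the x-axis
px <- c(fpr, 1)
py <- c(tpr, 0)

# The boundary is traced clockwise here, hence the abs()
auc <- abs(area(px, py))
auc
```

For piecewise-linear ROC points this should agree with the trapezoidal rule applied to the same points.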
olivier.abz wrote:
> I would like to calculate the area under the ROC curve for my predictive
> model. [...] Does anybody know of a function that would give the result I
> want using an array of specificity and an array of sensitivity as input?

Olivier,

The ROC curves in my view just get in the way. They are mainly useful in that, almost by accident, the area under the curve equals a nice pure discrimination index. Go for the direct calculation of the ROC area based on the Wilcoxon-Mann-Whitney-Somers' Dxy rank correlation approach, e.g., using the rcorr.cens function in the Hmisc package, which provides Dxy = 2(C - .5) where C = ROC area. It also provides the S.E. of Dxy and thus of C, and generalizes to censored data. This approach uses the raw data, not sensitivity and specificity (which are improper scoring rules).

This is assuming you are using an external validation dataset. If this is not the case you will need to use the bootstrap or intensive cross-validation, e.g., using the rms package's lrm and validate functions.

Also note that it is not usually appropriate to compare two ROC areas for choosing a model, as this is too insensitive. It is the same as taking the difference between two scaled Wilcoxon statistics, which is simply not done.

Frank
--
Frank E Harrell Jr
Professor and Chair, School of Medicine
Department of Biostatistics, Vanderbilt University
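The rank-based calculation described above can be sketched in base R. This is not the Hmisc implementation, just the Wilcoxon-Mann-Whitney rank-sum identity written out; the data and the helper name c_index are hypothetical, and no standard error is computed.

```r
# Hypothetical toy data: 'score' is the model's predicted risk,
# 'outcome' is the observed 0/1 response
score   <- c(0.1, 0.4, 0.35, 0.8, 0.7, 0.2)
outcome <- c(0,   0,   1,    1,   1,   0)

# C = probability that a randomly chosen positive case scores higher
# than a randomly chosen negative case (ties count as 1/2)
c_index <- function(score, outcome) {
  n1 <- sum(outcome == 1)
  n0 <- sum(outcome == 0)
  r  <- rank(score)  # rank() uses midranks for ties by default
  (sum(r[outcome == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

C   <- c_index(score, outcome)  # ROC area
Dxy <- 2 * (C - 0.5)            # Somers' Dxy, as in the post above
c(C = C, Dxy = Dxy)
```

With Hmisc installed, rcorr.cens(score, outcome) is the function the post recommends; it should report the same C and Dxy together with a standard deviation for Dxy.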
Well, you can use the trapezoidal rule to numerically calculate any area under the curve. I don't know if a specific function exists, but you could create one. The principle is basically to compute the area between two successive points of your profile with:

AREA = 0.5*(Response1 + Response2)*(Time2 - Time1)

where Time1 and Time2 are the times of Response1 and Response2. Finally, add all the areas together to obtain the total area between your first and last points.

HIH
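For the original question (arrays of sensitivity and specificity as input), the trapezoidal approach might look like the sketch below. Each slice is the mean of two successive heights multiplied by the width between them. The function name auc_from_sens_spec and the input values are hypothetical, and the sketch assumes the curve should be anchored at (0, 0) and (1, 1).

```r
# Trapezoidal AUC from arrays of sensitivity and specificity.
# Inputs may be in any order; one pair per cutoff.
auc_from_sens_spec <- function(sensitivity, specificity) {
  fpr <- c(0, 1 - specificity, 1)  # x axis: false positive rate
  tpr <- c(0, sensitivity, 1)      # y axis: true positive rate
  ord <- order(fpr, tpr)           # sort points left to right
  fpr <- fpr[ord]
  tpr <- tpr[ord]
  # Sum of slices: 0.5 * (height1 + height2) * width
  sum(0.5 * (tpr[-1] + tpr[-length(tpr)]) * diff(fpr))
}

# Hypothetical values for two cutoffs
auc <- auc_from_sens_spec(sensitivity = c(0.75, 0.50),
                          specificity = c(0.50, 0.75))
auc
```

If the input already contains the endpoints, the extra anchors only add zero-width slices and do not change the result.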
I assume that you have ordered pairs (x, y), where x = 1 - specificity and y = sensitivity. Your `x' values may or may not be equally spaced.

Here is how you could solve your problem. I show this with an example where we can compute the area under the curve exactly:

# Area under the curve
#
# Trapezoidal rule
# x values need not be equally spaced
#
trapezoid <- function(x, y) sum(diff(x) * (y[-1] + y[-length(y)])) / 2
#
# Simpson's rule when `n' is odd
# Composite Simpson and trapezoidal rules when `n' is even
# x values must be equally spaced; needs at least 5 points
#
simpson <- function(x, y) {
  n <- length(y)
  odd <- n %% 2
  if (odd) area <- 1/3 * sum(y[1] + 2 * sum(y[seq(3, (n-2), by=2)]) + 4 * sum(y[seq(2, (n-1), by=2)]) + y[n])
  # even n: Simpson on the first n-1 points, trapezoid on the last interval
  if (!odd) area <- 1/3 * sum(y[1] + 2 * sum(y[seq(3, (n-3), by=2)]) + 4 * sum(y[seq(2, (n-2), by=2)]) + y[n-1]) + 1/2 * (y[n-1] + y[n])
  dx <- x[2] - x[1]
  return(area * dx)
}
#
# An example for AUC calculation
x <- seq(0, 1, length=21)
roc <- function(x, a) x + a * x * (1 - x)
plot(x, roc(x, a=0.5), type="l")
lines(x, roc(x, a=0.8), col=2)
lines(x, roc(x, a=1.2), col=3)
abline(a=0, b=1, lty=2)  # diagonal reference line; abline needs both intercept and slope
y <- roc(x, a=1)
trapezoid(x, y)  # exact answer is 2/3
simpson(x, y)    # exact answer is 2/3

As you can see, Simpson's rule is more accurate, but the difference should not matter in applications as long as you have a sufficient number of points for sensitivity and specificity. Also, note that the improved accuracy of Simpson's rule is more fully realized when there is an odd number of `x' values. If the number of points is even, the trapezoidal rule at the end point degrades the accuracy of the Simpson approximation.

Hope this helps,
Ravi.
____________________________________________________________________
Ravi Varadhan, Ph.D.
Assistant Professor, Division of Geriatric Medicine and Gerontology
School of Medicine, Johns Hopkins University
Ph. (410) 502-2619
email: rvaradhan at jhmi.edu
See package ROCR. Then see ?performance; in the details, it describes a measure of auc.

Tom Fletcher
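A minimal sketch of the ROCR route, assuming the package is installed. Note that ROCR works from the raw predicted scores and observed labels rather than from arrays of sensitivity and specificity. The toy data are hypothetical, and the requireNamespace() guard is only there so the snippet does not fail when ROCR is missing.

```r
# Hypothetical toy data: predicted scores and observed 0/1 labels
score   <- c(0.1, 0.4, 0.35, 0.8)
outcome <- c(0,   0,   1,    1)

auc <- NA_real_
if (requireNamespace("ROCR", quietly = TRUE)) {
  pred <- ROCR::prediction(score, outcome)          # pair scores with labels
  perf <- ROCR::performance(pred, measure = "auc")  # area under the ROC curve
  auc  <- perf@y.values[[1]]                        # the AUC is stored in the y.values slot
}
auc
```

For these four points, three of the four positive/negative pairs are ordered correctly, so the AUC should be 0.75 when ROCR is available.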