thr3ads.net - R help - [R] Standard error for the area under a smoothed ROC curve? [Jan 2005]

Dan Bolser

2005-Jan-11 16:50 UTC

[R] Standard error for the area under a smoothed ROC curve?

Hello, 

I am making some use of ROC curve analysis. 

I have found much help on the mailing list, and I have used the Area Under
the Curve (AUC) functions from the ROC package in the Bioconductor project...

http://www.bioconductor.org/repository/release1.5/package/Source/
ROC_1.0.13.tar.gz 

However, I read here...

http://www.medcalc.be/manual/mpage06-13b.php

"The 95% confidence interval for the area can be used to test the
hypothesis that the theoretical area is 0.5. If the confidence interval
does not include the 0.5 value, then there is evidence that the laboratory
test does have an ability to distinguish between the two groups (Hanley &
McNeil, 1982; Zweig & Campbell, 1993)."

But beyond that early statement, the above article is short on details. Can
anyone tell me how to calculate the CI for the AUC?


I read this...

http://www.bioconductor.org/repository/devel/vignette/ROCnotes.pdf

Which talks about resampling (by showing R code), but I can't understand
what is going on, or what is calculated (the example given is specific to
microarray analysis I think).

I think a general AUC CI function would be a good addition to the ROC
package.




One more thing: in calculating the AUC, I see that the spline function is
recommended over the approx function. Here...

http://tolstoy.newcastle.edu.au/R/help/04/10/6138.html

How would I rewrite the following AUC functions (adapted from bioconductor
source) to use splines (or approxfun or splinefun) ...

> spe # Specificity
 [1] 0.02173913 0.13043478 0.21739130 0.32608696 0.43478261 0.54347826
 [7] 0.65217391 0.76086957 0.89130435 1.00000000 1.00000000 1.00000000
[13] 1.00000000
> sen # Sensitivity
 [1] 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.9302326 0.8139535
 [8] 0.6976744 0.5581395 0.4418605 0.3488372 0.2325581 0.1162791

trapezint(1-spe,sen)
my.integrate(1-spe,sen)

## Functions
## Nicked (and modified) from the ROC function in bioconductor.
"trapezint" <-
function (x, y, a = 0, b = 1)
{
    if (x[1] > x[length(x)]) {
      x <- rev(x)
      y <- rev(y)
    }
    y <- y[x >= a & x <= b]
    x <- x[x >= a & x <= b]
    if (length(unique(x)) < 2)
        return(NA)
    ya <- approx(x, y, a, ties = max, rule = 2)$y
    yb <- approx(x, y, b, ties = max, rule = 2)$y
    x <- c(a, x, b)
    y <- c(ya, y, yb)
    h <- diff(x)
    lx <- length(x)
    0.5 * sum(h * (y[-1] + y[-lx]))
}

"my.integrate" <-
function (x, y, t0 = 1)
{
    f <- function(j) approx(x,y,j,rule=2,ties=max)$y
    integrate(f, 0, t0)$value
}





Thanks for any pointers,
Dan.
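
On the question of rewriting the AUC functions with approxfun or
splinefun, here is a minimal sketch rather than a definitive answer. The
helper name auc_interp and its arguments are hypothetical. It assumes the
ROC curve can be anchored at (1, 1), so that no extrapolation is needed on
the right, and it uses the monotone "monoH.FC" spline so the smoothed curve
does not overshoot between operating points.

```r
# Operating points from the post: specificity and sensitivity per threshold.
spe <- c(0.02173913, 0.13043478, 0.21739130, 0.32608696, 0.43478261,
         0.54347826, 0.65217391, 0.76086957, 0.89130435, 1, 1, 1, 1)
sen <- c(1, 1, 1, 1, 1, 0.9302326, 0.8139535, 0.6976744, 0.5581395,
         0.4418605, 0.3488372, 0.2325581, 0.1162791)

# Hypothetical helper: integrate an interpolated ROC curve over [a, b].
# fpr is 1 - specificity, tpr is sensitivity.
auc_interp <- function(fpr, tpr, a = 0, b = 1,
                       type = c("linear", "spline")) {
  type <- match.arg(type)
  # Assumption: the curve passes through (1, 1), so add that point.
  fpr <- c(fpr, 1)
  tpr <- c(tpr, 1)
  # ties = max keeps the top of each vertical step, as in the original code.
  f <- switch(type,
    linear = approxfun(fpr, tpr, rule = 2, ties = max),
    spline = splinefun(fpr, tpr, method = "monoH.FC", ties = max))
  integrate(f, a, b, subdivisions = 1000L)$value
}

auc_linear <- auc_interp(1 - spe, sen, type = "linear")
auc_spline <- auc_interp(1 - spe, sen, type = "spline")
print(c(linear = auc_linear, spline = auc_spline))
```

The linear version is the same piecewise-linear curve that trapezint and
my.integrate integrate, so it should agree with them on these data. The
spline version only changes how the curve is drawn between the observed
points; it says nothing about the sampling variability of the AUC.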

Frank E Harrell Jr

2005-Jan-12 13:18 UTC

[R] Standard error for the area under a smoothed ROC curve?

I don't see why the above formulas are being used.  The 
Bamber-Hanley-McNeil-Wilcoxon-Mann-Whitney nonparametric method works 
great.  Just get the U statistic (concordance probability) used in 
Wilcoxon.  As Somers' Dxy rank correlation coefficient is 2*(C-0.5), where 
C is the concordance or ROC area, the Hmisc package function rcorr.cens 
uses U statistic methods to get the standard error of Dxy.  You can 
easily translate this to a standard error of C: since C = (Dxy+1)/2, the 
standard error of C is half that of Dxy.

Frank

-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University
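
The nonparametric approach described in the reply can be sketched in base
R as follows. This is an illustration under stated assumptions, not the
computation that rcorr.cens performs: the helper name auc_hanley is
hypothetical, status is assumed to be coded 1 (diseased) and 0 (healthy),
and the standard error uses the closed-form approximation from Hanley &
McNeil (1982) rather than the U-statistic variance estimator in Hmisc, so
the two will generally give slightly different numbers.

```r
# Hypothetical helper: nonparametric AUC with the Hanley & McNeil (1982) SE.
# 'score' is the test result; 'status' is 1 for diseased, 0 for healthy.
auc_hanley <- function(score, status, conf.level = 0.95) {
  pos <- score[status == 1]
  neg <- score[status == 0]
  n1 <- length(pos)
  n0 <- length(neg)
  # Mann-Whitney U from midranks, so tied pairs count one half.
  r <- rank(c(pos, neg))
  U <- sum(r[seq_len(n1)]) - n1 * (n1 + 1) / 2
  C <- U / (n1 * n0)              # concordance probability = ROC area
  # Hanley-McNeil approximation to the variance of C.
  Q1 <- C / (2 - C)
  Q2 <- 2 * C^2 / (1 + C)
  se <- sqrt((C * (1 - C) + (n1 - 1) * (Q1 - C^2) +
              (n0 - 1) * (Q2 - C^2)) / (n1 * n0))
  z <- qnorm(1 - (1 - conf.level) / 2)
  # Dxy = 2 * (C - 0.5), so its standard error is twice that of C.
  c(C = C, se = se, lower = C - z * se, upper = C + z * se,
    Dxy = 2 * C - 1, se.Dxy = 2 * se)
}

# Simulated example data (not from the thread).
set.seed(1)
score  <- c(rnorm(40, mean = 1), rnorm(60, mean = 0))
status <- rep(c(1, 0), c(40, 60))
print(auc_hanley(score, status))
```

If the returned interval (lower, upper) excludes 0.5, that is the evidence
of discrimination that the MedCalc passage quoted in the question refers
to. The normal-approximation interval is not bounded to [0, 1] and can be
poor when C is close to 1 or the groups are small.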
