How to calculate the area under the curve

I assume that you have ordered pairs (x, y), where x = 1 - specificity and y = sensitivity.  Your x values may or may not be equally spaced.  Here is how you could solve your problem.  I show this with an example where we can compute the area under the curve exactly:

# Area under the curve
#
# Trapezoidal rule
# x values need not be equally spaced
#
trapezoid <- function(x,y) sum(diff(x)*(y[-1]+y[-length(y)]))/2
#
#
# Composite Simpson's rule when n (the number of points) is odd
# When n is even: Simpson's rule on the first n-1 points, plus the
# trapezoidal rule on the last interval
# x values must be equally spaced; at least 5 points are needed
#
simpson <- function(x, y){
  n <- length(y)
  stopifnot(length(x) == n, n >= 5)
  dx <- x[2] - x[1]
  # m = number of points covered by Simpson's rule (always odd)
  m <- if (n %% 2 == 1) n else n - 1
  area <- 1/3 * (y[1] + 2*sum(y[seq(3, m-2, by=2)]) +
                 4*sum(y[seq(2, m-1, by=2)]) + y[m])
  # even n: cover the last interval with a single trapezoid
  if (m < n) area <- area + 1/2 * (y[n-1] + y[n])
  return(area * dx)
}
#
# An example for AUC calculation
x <- seq(0, 1, length=21)

roc <- function(x, a) x + a * x * (1 - x)

plot(x, roc(x, a=0.5), type="l")
lines(x, roc(x, a=0.8), col=2)
lines(x, roc(x, a=1.2), col=3)
abline(a=0, b=1, lty=2)  # chance diagonal; abline() needs the intercept `a' as well as the slope `b'

y <- roc(x, a=1)

trapezoid(x, y)  # exact answer is 2/3

simpson(x, y) # exact answer is 2/3
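
To make the comparison concrete, here is a minimal cross-check that recomputes both rules on the a = 1 curve with explicit weight vectors rather than the functions above, and reports each error against the exact value 2/3.  The names w_trap, w_simp, err_trap and err_simp are mine, for illustration only, and the sketch assumes an odd number of equally spaced points.

```r
# Cross-check on the a = 1 curve, y = x + x*(1 - x) = 2x - x^2,
# whose exact area on [0, 1] is 2/3.
x <- seq(0, 1, length.out = 21)
y <- x + 1 * x * (1 - x)
h <- x[2] - x[1]
n <- length(x)   # assumed odd here

# Trapezoidal weights: 1/2 at the two end points, 1 elsewhere
w_trap <- c(0.5, rep(1, n - 2), 0.5)

# Simpson weights for odd n: (1, 4, 2, 4, ..., 2, 4, 1) / 3
w_simp <- c(1, rep(c(4, 2), (n - 3) / 2), 4, 1) / 3

# Signed errors relative to the exact area
err_trap <- sum(w_trap * y) * h - 2/3
err_simp <- sum(w_simp * y) * h - 2/3

print(c(trapezoid = err_trap, simpson = err_simp))
```

For this quadratic curve the trapezoidal error is -h^2/6 (about -4.2e-4 with h = 0.05), while Simpson's rule is exact up to rounding, since it integrates polynomials of degree up to three exactly.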

As you can see, Simpson's rule is more accurate, but the difference should not matter in applications as long as you have a sufficient number of points for sensitivity and specificity.  Also, note that the improved accuracy of Simpson's rule is fully realized only when there is an odd number of x values.  If the number of points is even, the trapezoidal rule applied to the last interval degrades the accuracy of the Simpson approximation.
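
The odd/even effect can be illustrated with a small sketch that evaluates the same a = 1 curve on 21 points and on 20 points.  The helpers auc_weights, f and err are hypothetical names of my own; auc_weights is meant to mirror the scheme described above (Simpson on the first n-1 points and one trapezoid on the last interval when n is even), not to replace simpson().

```r
# Hypothetical helper: quadrature weights (to be multiplied by the spacing h).
# Odd n : Simpson weights (1, 4, 2, ..., 4, 1) / 3
# Even n: Simpson weights on the first n-1 points, plus a trapezoid
#         (weights 1/2, 1/2) on the last interval
auc_weights <- function(n) {
  simp <- function(m) c(1, rep(c(4, 2), (m - 3) / 2), 4, 1) / 3  # m odd, m >= 3
  if (n %% 2 == 1) return(simp(n))
  w <- c(simp(n - 1), 0)
  w[(n - 1):n] <- w[(n - 1):n] + 0.5
  w
}

# Test curve with known area 2/3 on [0, 1]
f <- function(x) x + x * (1 - x)

# Absolute error of the rule on n equally spaced points
err <- function(n) {
  x <- seq(0, 1, length.out = n)
  h <- x[2] - x[1]
  abs(sum(auc_weights(n) * f(x)) * h - 2/3)
}

errs <- sapply(c(odd = 21, even = 20), err)
print(errs)
```

With 21 points the error is zero up to rounding; with 20 points it is h^3/6 with h = 1/19, about 2.4e-5, all of it coming from the single trapezoid on the last interval.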

Hope this helps,
Ravi.

____________________________________________________________________

Ravi Varadhan, Ph.D.
Assistant Professor,
Division of Geriatric Medicine and Gerontology
School of Medicine
Johns Hopkins University

Ph. (410) 502-2619
email: rvaradhan at jhmi.edu


----- Original Message -----
From: "olivier.abz" <0509785 at rgu.ac.uk>
Date: Thursday, October 22, 2009 10:24 am
Subject: [R]  How to calculate the area under the curve
To: r-help at r-project.org