An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20080908/0ec54292/attachment.pl>
ROC curve from logistic regression
4 messages · gallon li, Frank E Harrell Jr, Pedro.Rodriguez at sungard.com +1 more
gallon li wrote:
I know how to compute the ROC curve and the empirical AUC from the logistic regression after fitting the model. But here is my question, how can I compute the standard error for the AUC estimator resulting form logistic regression? The variance should be more complicated than AUC based on known test results. Does anybody know a reference on this problem?
The rcorr.cens function in the Hmisc package will compute the std. error of Somers' Dxy rank correlation. Dxy = 2*(C-.5) where C is the ROC area. This standard error does not include a variance component for the uncertainty in the model (e.g., it does not penalize for the estimation of the regression coefficients if you are estimating the coefficients and assessing ROC area on the same sample). The lrm function in the Design package fits binary and ordinal logistic regression models and reports C, Dxy, and other measures. I haven't seen an example where drawing the ROC curve provides useful information that leads to correct actions. Frank
Frank E Harrell Jr Professor and Chair School of Medicine
Department of Biostatistics Vanderbilt University
Hi
Try the following reference:
Comparison of Three Methods for Estimating the
Standard Error of the Area under the Curve in ROC
Analysis of Quantitative Data by Hajian-Tilaki and Hanley, Academic
Radiology, Vol 9, No 11, November 2002.
Below is a simple implementation that will return both the AUC and its
standard error (DeLong et al method).
Hope this helps...
Pedro
#Input: yreal [-1,1]
auc <- function(yreal,forecasts){
sizeT <-nrow(yreal)
pos <- 0
for(i in 1:sizeT){
if(yreal[i]>0) {pos <- pos + 1}
}
neg <- sizeT-pos
yrealpos <- vector(length=pos)
yrealneg <- vector(length=neg)
forepos <- vector(length=pos)
foreneg <- vector(length=neg)
controlpos <- 1
controlneg <- 1
for(i in 1:sizeT){
if(yreal[i]>0) {
yrealpos[controlpos] <- yreal[i]
forepos[controlpos] <- forecasts[i]
controlpos <- controlpos + 1
} else {
yrealneg[controlneg] <- yreal[i]
foreneg[controlneg] <- forecasts[i]
controlneg <- controlneg + 1
}
}
oper <- 0
for( i in 1:pos){
for(j in 1:neg){
if(forepos[i] > foreneg[j]) {oper <- oper + 1}
if(forepos[i]==foreneg[j]) {oper <- oper + 0.50
}
}
}
area <- oper/(pos*neg)
vpj <- vector(length=pos)
vqk <- vector(length=neg)
oper <- 0
for(i in 1:pos){
for(j in 1:neg){
if(forepos[i] > foreneg[j]) {oper <-
oper + 1
} else {if(forepos[i]==foreneg[j]) {oper
<- oper + 0.50 }}
}
division <- oper/neg
resta <- (division-area)^2
vpj[i] <- resta
oper <- 0
}
oper <- 0
resta <- 0
for(j in 1:neg){
for(i in 1:pos){
if(forepos[i] > foreneg[j]) {oper <-
oper + 1
} else {if(forepos[i]==foreneg[j]) {oper
<- oper + 0.50 }}
}
division <- oper/pos
resta <- (division-area)^2
vqk[j] <- resta
oper <- 0
}
vpj <- vpj/(pos*(pos-1))
vqk <- vqk/(neg*(neg-1))
var <- sum(vpj)+sum(vqk)
s <- sqrt(var)
return(list(AUC=area, std=s))
}
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of Frank E Harrell Jr
Sent: Monday, September 08, 2008 8:22 AM
To: gallon li
Cc: r-help
Subject: Re: [R] ROC curve from logistic regression
gallon li wrote:
I know how to compute the ROC curve and the empirical AUC from the
logistic
regression after fitting the model. But here is my question, how can I compute the standard error for the
AUC
estimator resulting form logistic regression? The variance should be
more
complicated than AUC based on known test results. Does anybody know a reference on this problem?
The rcorr.cens function in the Hmisc package will compute the std. error of Somers' Dxy rank correlation. Dxy = 2*(C-.5) where C is the ROC area. This standard error does not include a variance component for the uncertainty in the model (e.g., it does not penalize for the estimation of the regression coefficients if you are estimating the coefficients and assessing ROC area on the same sample). The lrm function in the Design package fits binary and ordinal logistic regression models and reports C, Dxy, and other measures. I haven't seen an example where drawing the ROC curve provides useful information that leads to correct actions. Frank
Frank E Harrell Jr Professor and Chair School of Medicine
Department of Biostatistics Vanderbilt
University
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20080908/077bb619/attachment.pl>