ROC curve from logistic regression

An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20080908/0ec54292/attachment.pl>
I know how to compute the ROC curve and the empirical AUC from the logistic
regression after fitting the model.

But here is my question, how can I compute the standard error for the AUC
estimator resulting form logistic regression? The variance should be more
complicated than AUC based on known test results. Does anybody know a
reference on this problem?

The rcorr.cens function in the Hmisc package will compute the std. error 
of Somers' Dxy rank correlation.  Dxy = 2*(C-.5) where C is the ROC 
area.  This standard error does not include a variance component for the 
uncertainty in the model (e.g., it does not penalize for the estimation 
of the regression coefficients if you are estimating the coefficients 
and assessing ROC area on the same sample).

The lrm function in the Design package fits binary and ordinal logistic 
regression models and reports C, Dxy, and other measures.

I haven't seen an example where drawing the ROC curve provides useful 
information that leads to correct actions.

Frank
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University
Hi

Try the following reference:

Comparison of Three Methods for Estimating the
Standard Error of the Area under the Curve in ROC
Analysis of Quantitative Data by Hajian-Tilaki and Hanley, Academic
Radiology, Vol 9, No 11, November 2002.

Below is a simple implementation that will return both the AUC and its
standard error (DeLong et al method). 

Hope this helps...

Pedro

#Input: yreal [-1,1]

auc <- function(yreal,forecasts){

	sizeT <-nrow(yreal)
	pos <- 0
	for(i in 1:sizeT){
		if(yreal[i]>0) {pos <- pos + 1}
	}
	neg <- sizeT-pos
	yrealpos <- vector(length=pos)
	yrealneg <- vector(length=neg)
	forepos  <- vector(length=pos)
	foreneg  <- vector(length=neg)

	controlpos <- 1
	controlneg <- 1
	for(i in 1:sizeT){
		if(yreal[i]>0) {
			yrealpos[controlpos] <- yreal[i]
			forepos[controlpos]  <- forecasts[i]
			controlpos <- controlpos + 1		
		} else {
			yrealneg[controlneg] <- yreal[i]
			foreneg[controlneg] <- forecasts[i]
			controlneg <- controlneg + 1
		}
	}
	oper <- 0
	for( i in 1:pos){
		for(j in 1:neg){
			if(forepos[i] > foreneg[j]) {oper <- oper + 1}
			if(forepos[i]==foreneg[j]) {oper <- oper + 0.50
} 
		}
	}

	area <- oper/(pos*neg)
	vpj <- vector(length=pos)
		vqk <- vector(length=neg)
		oper <- 0
		for(i in 1:pos){
			for(j in 1:neg){
				if(forepos[i] > foreneg[j]) {oper <-
oper + 1 
				} else {if(forepos[i]==foreneg[j]) {oper
<- oper + 0.50 }} 
			}
			division <- oper/neg
			resta <- (division-area)^2
			vpj[i] <- resta
			oper <- 0
		}
		oper  <- 0
		resta <- 0
		for(j in 1:neg){
			for(i in 1:pos){
				if(forepos[i] > foreneg[j]) {oper <-
oper + 1 
				} else {if(forepos[i]==foreneg[j]) {oper
<- oper + 0.50 }} 
			}
			division <- oper/pos
			resta <- (division-area)^2	
			vqk[j] <- resta
			oper   <- 0
		}
	vpj <- vpj/(pos*(pos-1))
	vqk <- vqk/(neg*(neg-1))
	var <- sum(vpj)+sum(vqk)
	s   <- sqrt(var)

	return(list(AUC=area, std=s))
}

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of Frank E Harrell Jr
Sent: Monday, September 08, 2008 8:22 AM
To: gallon li
Cc: r-help
Subject: Re: [R] ROC curve from logistic regression
I know how to compute the ROC curve and the empirical AUC from the
logistic
regression after fitting the model.

But here is my question, how can I compute the standard error for the
AUC
estimator resulting form logistic regression? The variance should be
more
complicated than AUC based on known test results. Does anybody know a
reference on this problem?

The rcorr.cens function in the Hmisc package will compute the std. error

of Somers' Dxy rank correlation.  Dxy = 2*(C-.5) where C is the ROC 
area.  This standard error does not include a variance component for the

uncertainty in the model (e.g., it does not penalize for the estimation 
of the regression coefficients if you are estimating the coefficients 
and assessing ROC area on the same sample).

The lrm function in the Design package fits binary and ordinal logistic 
regression models and reports C, Dxy, and other measures.

I haven't seen an example where drawing the ROC curve provides useful 
information that leads to correct actions.

Frank
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt
University

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20080908/077bb619/attachment.pl>