Skip to content

Hosmer Lemeshow test

3 messages · Endy BlackEndy, David L Carlson, Frank E Harrell Jr

#
The warning is commented out or it would have been printed:
   #   warning("Some expected counts are less than 5. Use smaller number of
groups")

For 23 data points, the default of 10 bins is too many since one of the bins
is 0. Eight bins gives the warning, but prints results. You didn't indicate
how many bins SPSS used. 

-------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Endy BlackEndy
Sent: Tuesday, April 23, 2013 10:31 AM
To: r-help at r-project.org
Subject: [R] Hosmer Lemeshow test

Hi to everybody. I use the following routine (i found it in the internet) to
compute the Hosmer-Lemeshow test in the framework of logistic regression.

hosmerlemeshow = function(obj, g=10) {
# first, check to see if we fed in the right kind of object
   stopifnot(family(obj)$family=="binomial" && family(obj)$link=="logit")
   y = obj$model[[1]]
   # the double bracket (above) gets the index of items within an object
   if (is.factor(y))
      y = as.numeric(y)==2
   yhat = obj$fitted.values
   cutyhat=cut(yhat,quantile(yhat,0:g/g),include.lowest=TRUE)

   obs = xtabs(cbind(1 - y, y) ~ cutyhat)
   expect = xtabs(cbind(1 - yhat, yhat) ~ cutyhat)
   if (any(expect < 5))
   #   warning("Some expected counts are less than 5. Use smaller number of
groups")
   chisq = sum((obs - expect)^2/expect)
   P = 1 - pchisq(chisq, g - 2)
   cat("H-L2 statistic",round(chisq,4)," and its p value",P,"\n")
   # by returning an object of class "htest", the function will perform like
the
   # built-in hypothesis tests
   return(structure(list(
      method = c(paste("Hosmer and Lemeshow goodness-of-fit test with", g,
"bins", sep=" ")),
      data.name = deparse(substitute(obj)),
      statistic = c(X2=chisq),
      parameter = c(df=g-2),
      p.value = P
   ), class='htest'))
   return(list(chisq=chisq,p.value=P))
}

However when i run it with the data set below i get the value NaN. Using the
same data set with SPSS i get a specific value.

FlightNo Temp ThermalDisast
1 66 0
2 70 1
3 69 0
4 68 0
5 67 0
6 72 0
7 73 0
8 70 0
9 57 1
10 63 1
11 70 1
12 78 0
13 67 0
14 53 1
15 67 0
16 75 0
17 70 0
18 81 0
19 76 0
20 79 0
21 75 1
22 76 0
23 58 1

Any suggestions will greatly appreciated.
With regards
Endy


______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
#
The H-L test is now considered to be obsolete by many.  One replacement is
the Hosmer - le Cessie test as implemented in the rms package residuals.lrm
function.  This is a one degree of freedom test.

One problem with H-L is its use of arbitrary binning and suboptimal power. 
The new test requires no binning at all.  Probably better still are directed
tests such as tests of linearity and additivity.
Frank

David Carlson wrote

            

            
-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: http://r.789695.n4.nabble.com/Hosmer-Lemeshow-test-tp4665115p4665154.html
Sent from the R help mailing list archive at Nabble.com.