An embedded and charset-unspecified text was scrubbed... Name: not available URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20130423/daf63e0e/attachment.pl>
Hosmer Lemeshow test
3 messages · Endy BlackEndy, David L Carlson, Frank E Harrell Jr
The warning is commented out or it would have been printed:
# warning("Some expected counts are less than 5. Use smaller number of
groups")
For 23 data points, the default of 10 bins is too many since one of the bins
is 0. Eight bins gives the warning, but prints results. You didn't indicate
how many bins SPSS used.
-------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On
Behalf Of Endy BlackEndy
Sent: Tuesday, April 23, 2013 10:31 AM
To: r-help at r-project.org
Subject: [R] Hosmer Lemeshow test
Hi to everybody. I use the following routine (i found it in the internet) to
compute the Hosmer-Lemeshow test in the framework of logistic regression.
hosmerlemeshow = function(obj, g=10) {
# first, check to see if we fed in the right kind of object
stopifnot(family(obj)$family=="binomial" && family(obj)$link=="logit")
y = obj$model[[1]]
# the double bracket (above) gets the index of items within an object
if (is.factor(y))
y = as.numeric(y)==2
yhat = obj$fitted.values
cutyhat=cut(yhat,quantile(yhat,0:g/g),include.lowest=TRUE)
obs = xtabs(cbind(1 - y, y) ~ cutyhat)
expect = xtabs(cbind(1 - yhat, yhat) ~ cutyhat)
if (any(expect < 5))
# warning("Some expected counts are less than 5. Use smaller number of
groups")
chisq = sum((obs - expect)^2/expect)
P = 1 - pchisq(chisq, g - 2)
cat("H-L2 statistic",round(chisq,4)," and its p value",P,"\n")
# by returning an object of class "htest", the function will perform like
the
# built-in hypothesis tests
return(structure(list(
method = c(paste("Hosmer and Lemeshow goodness-of-fit test with", g,
"bins", sep=" ")),
data.name = deparse(substitute(obj)),
statistic = c(X2=chisq),
parameter = c(df=g-2),
p.value = P
), class='htest'))
return(list(chisq=chisq,p.value=P))
}
However when i run it with the data set below i get the value NaN. Using the
same data set with SPSS i get a specific value.
FlightNo Temp ThermalDisast
1 66 0
2 70 1
3 69 0
4 68 0
5 67 0
6 72 0
7 73 0
8 70 0
9 57 1
10 63 1
11 70 1
12 78 0
13 67 0
14 53 1
15 67 0
16 75 0
17 70 0
18 81 0
19 76 0
20 79 0
21 75 1
22 76 0
23 58 1
Any suggestions will greatly appreciated.
With regards
Endy
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
The H-L test is now considered to be obsolete by many. One replacement is the Hosmer - le Cessie test as implemented in the rms package residuals.lrm function. This is a one degree of freedom test. One problem with H-L is its use of arbitrary binning and suboptimal power. The new test requires no binning at all. Probably better still are directed tests such as tests of linearity and additivity. Frank David Carlson wrote
The warning is commented out or it would have been printed:
# warning("Some expected counts are less than 5. Use smaller number
of
groups")
For 23 data points, the default of 10 bins is too many since one of the
bins
is 0. Eight bins gives the warning, but prints results. You didn't
indicate
how many bins SPSS used.
-------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352
-----Original Message-----
From:
r-help-bounces@
[mailto:
r-help-bounces@
] On Behalf Of Endy BlackEndy Sent: Tuesday, April 23, 2013 10:31 AM To:
r-help@
Subject: [R] Hosmer Lemeshow test
Hi to everybody. I use the following routine (i found it in the internet)
to
compute the Hosmer-Lemeshow test in the framework of logistic regression.
hosmerlemeshow = function(obj, g=10) {
# first, check to see if we fed in the right kind of object
stopifnot(family(obj)$family=="binomial" && family(obj)$link=="logit")
y = obj$model[[1]]
# the double bracket (above) gets the index of items within an object
if (is.factor(y))
y = as.numeric(y)==2
yhat = obj$fitted.values
cutyhat=cut(yhat,quantile(yhat,0:g/g),include.lowest=TRUE)
obs = xtabs(cbind(1 - y, y) ~ cutyhat)
expect = xtabs(cbind(1 - yhat, yhat) ~ cutyhat)
if (any(expect < 5))
# warning("Some expected counts are less than 5. Use smaller number
of
groups")
chisq = sum((obs - expect)^2/expect)
P = 1 - pchisq(chisq, g - 2)
cat("H-L2 statistic",round(chisq,4)," and its p value",P,"\n")
# by returning an object of class "htest", the function will perform
like
the
# built-in hypothesis tests
return(structure(list(
method = c(paste("Hosmer and Lemeshow goodness-of-fit test with", g,
"bins", sep=" ")),
data.name = deparse(substitute(obj)),
statistic = c(X2=chisq),
parameter = c(df=g-2),
p.value = P
), class='htest'))
return(list(chisq=chisq,p.value=P))
}
However when i run it with the data set below i get the value NaN. Using
the
same data set with SPSS i get a specific value.
FlightNo Temp ThermalDisast
1 66 0
2 70 1
3 69 0
4 68 0
5 67 0
6 72 0
7 73 0
8 70 0
9 57 1
10 63 1
11 70 1
12 78 0
13 67 0
14 53 1
15 67 0
16 75 0
17 70 0
18 81 0
19 76 0
20 79 0
21 75 1
22 76 0
23 58 1
Any suggestions will greatly appreciated.
With regards
Endy
[[alternative HTML version deleted]]
______________________________________________
R-help@
mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. ______________________________________________
R-help@
mailing list
https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
----- Frank Harrell Department of Biostatistics, Vanderbilt University -- View this message in context: http://r.789695.n4.nabble.com/Hosmer-Lemeshow-test-tp4665115p4665154.html Sent from the R help mailing list archive at Nabble.com.