
Goodness of fit of binary logistic model

14 messages · Paul Smith, David Winsemius, Peter Dalgaard +1 more

#
Dear All,

I have just estimated this model:

-----------------------------------------------------------
Logistic Regression Model

lrm(formula = Y ~ X16, x = T, y = T)

                     Model Likelihood     Discrimination    Rank Discrim.
                        Ratio Test            Indexes          Indexes

Obs            82    LR chi2      5.58    R2       0.088    C       0.607
 0             46    d.f.            1    g        0.488    Dxy     0.215
 1             36    Pr(> chi2) 0.0182    gr       1.629    gamma   0.589
max |deriv| 9e-11                         gp       0.107    tau-a   0.107
                                          Brier    0.231


          Coef    S.E.   Wald Z Pr(>|Z|)
Intercept -1.3218 0.5627 -2.35  0.0188
X16=1      1.3535 0.6166  2.20  0.0282
-----------------------------------------------------------

Analyzing the goodness of fit:

-----------------------------------------------------------
Sum of squared errors     Expected value|H0                    SD
         1.890393e+01          1.890393e+01          6.073415e-16
                    Z                     P
        -8.638125e+04          0.000000e+00
-----------------------------------------------------------
These results seem to suggest that one should discard this model. However, there is
something that is puzzling me: if the 'Expected value|H0' coincides so
closely with the 'Sum of squared errors', why should one discard the
model? I am certainly missing something.

Thanks in advance,

Paul
#
On Aug 5, 2011, at 9:47 AM, Paul Smith wrote:
It's hard to tell what you are missing, since you have not described
your reasoning at all. So I guess what is in error is your expectation
that we would have drawn all of the unstated inferences that you draw
when offered the output from lrm. (I certainly did not draw the
inference that "one should discard the model".)

resid is a function designed for use with glm and lm models. Why
aren't you using residuals.lrm?
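For an lrm fit created with x=TRUE, y=TRUE, the test can be requested through the residuals method. A minimal self-contained sketch, with simulated stand-in data (the original data are not shown in the thread):

```r
## Hypothetical illustration: run the le Cessie - van Houwelingen
## unweighted sum of squares goodness-of-fit test on an lrm fit.
library(rms)
set.seed(1)
x <- rnorm(200)                         # simulated continuous predictor
y <- rbinom(200, 1, plogis(-0.5 + x))   # simulated binary response
fit <- lrm(y ~ x, x = TRUE, y = TRUE)   # x, y must be stored for residuals
residuals(fit, type = "gof")            # dispatches to residuals.lrm
```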
#
On Fri, Aug 5, 2011 at 4:54 PM, David Winsemius <dwinsemius at comcast.net> wrote:
----------------------------------------------------------
Sum of squared errors     Expected value|H0                    SD
         1.890393e+01          1.890393e+01          6.073415e-16
                    Z                     P
        -8.638125e+04          0.000000e+00
#
On Aug 5, 2011, at 12:21 PM, Paul Smith wrote:
Great. Now please answer the more fundamental question. Why do you
think this means "discard the model"?

David Winsemius, MD
West Hartford, CT
#
On Fri, Aug 5, 2011 at 5:35 PM, David Winsemius <dwinsemius at comcast.net> wrote:
Before answering that, let me tell you

resid(model.lrm,'gof')

calls residuals.lrm() -- so both approaches produce the same results.
(See the examples given by ?residuals.lrm)
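That the two calls are the same computation can be checked directly (again with simulated stand-in data, purely for illustration):

```r
## resid() is the S3 generic; for an object of class "lrm" it dispatches
## to residuals.lrm, so both spellings run the identical gof test.
library(rms)
set.seed(1)
x <- rnorm(200)
y <- rbinom(200, 1, plogis(x))
fit <- lrm(y ~ x, x = TRUE, y = TRUE)
identical(resid(fit, "gof"), residuals.lrm(fit, "gof"))  # should be TRUE
```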

To answer your question, I invoke the reasoning given by Frank Harrell at:

http://r.789695.n4.nabble.com/Hosmer-Lemeshow-goodness-of-fit-td3508127.html

He writes:

"The test in the rms package's residuals.lrm function is the le Cessie
- van Houwelingen - Copas - Hosmer unweighted sum of squares test for
global goodness of fit.  Like all statistical tests, a large P-value
has no information other than there was not sufficient evidence to
reject the null hypothesis.  Here the null hypothesis is that the true
probabilities are those specified by the model."
So, given the very small P-value reported above, one should reject the
null hypothesis? Please correct me if what I say is not correct, and
please direct me towards a way of establishing the goodness of fit of
my model.

Paul
#
On Aug 5, 2011, at 12:53 PM, Paul Smith wrote:
How does that apply to your situation? You have a small (one might
even say infinitesimal) p-value.
No, it doesn't follow at all, since that is not what he said. You are
committing a common logical error. "If A then B" does _not_ imply "If
not-A then not-B".
You need to state your research objectives and describe the science in
your domain. Then you need to describe your data gathering methods and
your analytic process. Then there might be a basis for further comment.
#
On Fri, Aug 5, 2011 at 7:07 PM, David Winsemius <dwinsemius at comcast.net> wrote:
I will try to read the original paper where this goodness of fit test
is proposed to clarify my doubts. In any case, in the paper

@article{barnes2008model,
  title={A model to predict outcomes for endovascular aneurysm repair
using preoperative variables},
  author={Barnes, M. and Boult, M. and Maddern, G. and Fitridge, R.},
  journal={European Journal of Vascular and Endovascular Surgery},
  volume={35},
  number={5},
  pages={571--579},
  year={2008},
  publisher={Elsevier}
}

it is written:

"Table 5 lists the results of the global goodness of fit test
for each outcome model using the le Cessie-van Houwelingen-Copas-Hosmer
unweighted sum of squares test.
In the table a 'good' fit is indicated by large p-values
(p > 0.05). Lack of fit is indicated by low p-values
(p < 0.05). All p-values indicate that the outcome models
have reasonable fit, with the exception of the outcome
model for conversion to open repairs (p = 0.04). The
low p-value suggests a lack of fit and it may be worth
refining the model for conversion to open repair."

In short, according to these authors, low p-values seem to suggest lack of fit.

Paul
#
On Aug 5, 2011, at 2:29 PM, Paul Smith wrote:
David Winsemius, MD
West Hartford, CT
#
On Aug 5, 2011, at 2:29 PM, Paul Smith wrote:
Sorry for the blank message.

So the topic is outcomes from surgery? The gof approach to
model assessment is just one way of looking at model comparison. The
real question is not "is this the right fit", but should rather be
"have I included as many relevant variables (for which I have data) as
I need to". You included exactly one variable. That would imply that
you had no prior knowledge about predictors of outcomes from surgery.
On the face of it, that seems highly implausible. Why are you even
contemplating a gof test in such a situation? Notice that those
authors said the effort should be made to "refine the model", not that
it "should be discarded".



David Winsemius, MD
West Hartford, CT
#
Please provide the data or better the R code for simulating the data that
shows the problem.  Then we can look further into this.
Frank

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: http://r.789695.n4.nabble.com/Goodness-of-fit-of-binary-logistic-model-tp3721242p3721997.html
Sent from the R help mailing list archive at Nabble.com.
#
Thanks, Frank. The following piece of code generates data that
exhibit the problem I reported:

-----------------------------------------
set.seed(123)
intercept <- -1.32
beta <- 1.36
xtest <- rbinom(1000, 1, 0.5)                 # binary predictor
linpred <- intercept + xtest * beta           # linear predictor
prob <- exp(linpred) / (1 + exp(linpred))     # inverse logit, i.e. plogis(linpred)
runis <- runif(1000, 0, 1)
ytest <- ifelse(runis < prob, 1, 0)           # Bernoulli draws with probability 'prob'
xtest <- as.factor(xtest)
ytest <- as.factor(ytest)
require(rms)
model <- lrm(ytest ~ xtest, x = TRUE, y = TRUE)
model
residuals.lrm(model, "gof")
-----------------------------------------

Paul
On Fri, Aug 5, 2011 at 7:58 PM, Frank Harrell <f.harrell at vanderbilt.edu> wrote:
#
On Aug 5, 2011, at 23:16, Paul Smith wrote:
Basically, what you have is zero divided by zero, except that floating point inaccuracy turns it into the ratio of two small numbers. So the Z statistic is effectively rubbish.
This comes about because the SSE minus its expectation has effectively zero variance, which makes it rather useless for testing whether the model fits.

Since the model is basically a full model for a 2x2 table, it is not surprising to me that "goodness of fit" tests behave poorly. In fact, I would conjecture that no sensible g.o.f. test exists for that case.
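Peter's point can be seen in the reported numbers themselves: the Z statistic is (SSE - E[SSE|H0]) / SD, and here the SD is at the level of machine rounding error. A base-R sketch with the posted values (the -5.25e-11 numerator below is an illustrative made-up value chosen only to match the order of the reported Z):

```r
## The gof Z statistic is (SSE - E[SSE | H0]) / SD.  With the values
## printed in the output above, the numerator is zero at printed
## precision while the SD is at the level of floating-point noise:
sse      <- 1.890393e+01   # sum of squared errors
expected <- 1.890393e+01   # its expectation under H0
sd_h0    <- 6.073415e-16   # essentially zero
(sse - expected) / sd_h0   # 0 at printed precision
## In the actual computation the unrounded numerator is itself only
## rounding noise; any such noise divided by a near-zero SD explodes:
-5.25e-11 / sd_h0          # ~ -8.6e4, the order of the reported Z
```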

#
Exactly right Peter.  Thanks.

There should be some way for me to detect such situations so that the
test does not produce an impressive-looking P-value.  Ideas welcomed!

This is a great example of why users should post a toy example in the
first posting, as we can immediately see that this model MUST fit the
data, so that any evidence for lack of fit has to be misleading.
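The point that the model must fit can be checked directly: with a single binary predictor the logistic model is saturated, so the fitted probabilities reproduce the observed cell proportions exactly. A base-R sketch with simulated data mirroring Paul's example (glm is used here instead of lrm only so the sketch needs no extra packages):

```r
## With one binary predictor the logistic model is saturated for the
## 2x2 table: fitted probabilities equal observed cell proportions,
## so the model cannot lack fit.
set.seed(123)
x <- rbinom(1000, 1, 0.5)                      # binary predictor
y <- rbinom(1000, 1, plogis(-1.32 + 1.36 * x)) # outcome from the true model
fit <- glm(y ~ factor(x), family = binomial)   # saturated logistic model
fitted_cells   <- tapply(fitted(fit), x, mean) # fitted probability per cell
observed_cells <- tapply(y, x, mean)           # observed proportion per cell
all.equal(as.vector(fitted_cells), as.vector(observed_cells))  # TRUE
```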

Frank
-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
#
Thanks, Frank. As a rule, I provide an example in my first post.
However, in this case, the data are confidential, and I was not
allowed to provide you with those data. Moreover, I thought that I
would not be able to generate data exhibiting the reported problem.

Paul
On Sat, Aug 6, 2011 at 3:27 PM, Frank Harrell <f.harrell at vanderbilt.edu> wrote: