Under dispersion; Was: [R] binomial glm warnings revisited
Tord Snall <tord.snall at ebc.uu.se> writes:
Null deviance: 13.1931 on 269 degrees of freedom
Residual deviance: 9.9168 on 268 degrees of freedom
AIC: 13.917
...
BUT, note the under dispersion. I GUESS it is because I have surveyed a moss on marked trees on three occasions (with two years in between). The response 1 means that the moss has disappeared, and dbh is tree diameter. (This corresponds to revisiting patients who have a disease, and whose weight is unchanged between the visits. H0: weight does not affect the chance of recovery from the disease)
Don't trust deviances as measures of dispersion with binary data!
Here is a version with quasibinomial:
...
Note, no warning. I guess that this quasibinomial model is more reliable than the binomial. Now I can trust the SE of the Estim. too, can't I?
No. Neither. With binary data, the deviance is purely a function of the fitted parameters. It is the difference in -2 log L between a "perfect fit" and the observed fit. A perfect fit has probability 0 where the observation is "0" and probability 1 where it is "1", and L == 1 identically in that case. Now consider the likelihood for the "complete toss-up", i.e. intercept and slope both equal to 0, so all probabilities are 0.5. With 270 observations (269 null degrees of freedom plus one), the likelihood in that case is 0.5^270, i.e. a constant. Take logarithms and notice that the model deviance plus the change in deviance from the model to the "toss-up" model is constant (2*270*log(2), to be precise). So what appears to be a measure of residual error is really just a measure of how far the fitted probabilities are from 0.5!
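The argument can be checked numerically. The following is only a sketch in plain Python (not the R code from the thread), and the data are an assumption: the post does not show the response, so `y` is a hypothetical vector with a single 1 among 270 observations (270 because the null deviance has 269 degrees of freedom). The helper `binary_deviance` is likewise a name made up for this illustration. It uses the fact that the saturated model has log-likelihood 0 for 0/1 data, so the deviance is just -2 log L of the fitted probabilities.

```python
import math

def binary_deviance(y, p):
    """Deviance of a 0/1 response y against fitted probabilities p.

    For binary data the saturated ("perfect fit") log-likelihood is 0,
    so the deviance reduces to -2 * log-likelihood of the fitted model.
    """
    loglik = sum(math.log(pi) if yi == 1 else math.log(1.0 - pi)
                 for yi, pi in zip(y, p))
    return -2.0 * loglik

# Hypothetical data (assumption): one "1" among 270 binary observations.
n = 270
y = [1] + [0] * (n - 1)

# Intercept-only fit: every fitted probability equals the observed proportion.
null_p = [sum(y) / n] * n
# "Complete toss-up": intercept and slope both 0, so every probability is 0.5.
toss_p = [0.5] * n

null_dev = binary_deviance(y, null_p)
toss_dev = binary_deviance(y, toss_p)

print("null deviance:    ", round(null_dev, 4))
print("toss-up deviance: ", round(toss_dev, 4))
print("2 * n * log(2):   ", round(2 * n * math.log(2), 4))
```

Under this assumed response the toss-up deviance equals 2*n*log(2) whatever the arrangement of 0s and 1s, and the null deviance is small only because the fitted probabilities sit far from 0.5, not because the data are "underdispersed".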
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907