overdispersion with binomial data?

1) Different types of residuals serve different purposes.

2) I am of the school that thinks it misguided to use the
result of a test for overdispersion to decide whether to
model it.  If there is any reason to suspect over-dispersion
(and in many/most ecological applications there is), relying
on a non-significant test as grounds for ignoring it is
anti-conservative.  I judge this a misuse of statistical
testing.  While some do rely on the result of a test in these
circumstances, I have never seen a credible defence of
this practice.

3) In fitting a quasi model using glm(), McCullagh and Nelder
(which I do not have handy at the moment) argue, if I recall
correctly, for use of the Pearson chi-square estimate of the
dispersion.  The alternative, the mean deviance, is unduly
susceptible to bias.
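
To make the two estimates in point 3 concrete, here is a minimal sketch
in Python (standard library only), not the glm() implementation.  It
assumes the simplest possible model, an intercept-only binomial fit, for
which the fitted probability is just the pooled proportion.  The function
name and the data are hypothetical and chosen purely for illustration.

```python
import math

def binomial_dispersion(successes, trials):
    """Return (Pearson, mean-deviance) dispersion estimates for an
    intercept-only binomial model (assumed here for simplicity)."""
    n_obs = len(successes)
    # For an intercept-only model the ML fitted probability is the pooled proportion.
    p_hat = sum(successes) / sum(trials)
    pearson = 0.0
    deviance = 0.0
    for y, n in zip(successes, trials):
        mu = n * p_hat
        # Pearson chi-square term: squared residual over binomial variance n*p*(1-p).
        pearson += (y - mu) ** 2 / (mu * (1 - p_hat))
        # Deviance terms; y*log(y/mu) is taken as 0 when y == 0 (likewise for n - y).
        if y > 0:
            deviance += 2 * y * math.log(y / mu)
        if y < n:
            deviance += 2 * (n - y) * math.log((n - y) / (n - mu))
    resid_df = n_obs - 1  # one fitted parameter (the intercept)
    return pearson / resid_df, deviance / resid_df

# Hypothetical data: six groups of 10 trials with widely varying success counts.
successes = [2, 9, 4, 10, 1, 8]
trials = [10, 10, 10, 10, 10, 10]
pearson_phi, deviance_phi = binomial_dispersion(successes, trials)
print("Pearson estimate:", pearson_phi)
print("Mean deviance:   ", deviance_phi)
```

Both quantities estimate the same dispersion parameter but need not
agree; the sketch shows only how each is computed, not which is better.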

4) Whereas the scale factor (the square root of the dispersion
estimate) is incorporated into the GLM residuals, the residuals from
glmer() exclude all random effects except that due to Poisson
variation.  The residuals are what remains after accounting for all
fixed and random effects, including observation level random effects.

5) Your mdf divisor is too small.  Your stream, stream:rip and ID
random terms account for further 'degrees of freedom'.  Maybe
degrees of freedom are not well defined in this context?  Anyone
care to comment?  The size of this quantity cannot, in any case, be
used to indicate over-fitting or under-fitting.  The model assumes
a theoretical value of 1.  Apart from bias in the estimate, the 
residuals are constrained by the model to have magnitudes that
are consistent with this theoretical value.

6) If you fit a non-quasi error (binomial or poisson) in a glm model, 
the summary output has a column labeled "z value".  If you fit a quasi 
error, the corresponding column is labeled "t value".  In the glmer 
output, the label 'z value' is in my view almost always inappropriate.
To the extent that the description carries across, it is the counterpart 
of the "t value" column in the glm output with the quasi error term.
(Actually, in the case where the denominator is entirely composed
from the theoretical variance, z values that are as near as may be
identical can almost always [always?] be derived using an 
appropriate glm model with a non-quasi error term.)
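
The relation between the "z value" and "t value" columns in the glm
output of point 6 can be sketched as follows.  This assumes the usual
summary.glm behaviour, in which the quasi fit keeps the same coefficient
estimates and multiplies each non-quasi standard error by the square
root of the dispersion estimate.  The function name and the numbers are
hypothetical, not taken from any model in this thread.

```python
import math

def wald_statistics(estimate, se_unscaled, dispersion):
    """Return (z, t) for one coefficient.

    se_unscaled is the standard error from the non-quasi fit (dispersion
    fixed at 1); the quasi fit is assumed to scale it by sqrt(dispersion).
    """
    z_value = estimate / se_unscaled
    t_value = estimate / (se_unscaled * math.sqrt(dispersion))
    return z_value, t_value

# Hypothetical coefficient, standard error and dispersion estimate.
z_value, t_value = wald_statistics(estimate=0.8, se_unscaled=0.25, dispersion=4.0)
print("z value:", z_value)
print("t value:", t_value)
```

With a dispersion estimate of 4 the t value is half the z value, which
is why ignoring overdispersion overstates the evidence.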

John Maindonald             email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
http://www.maths.anu.edu.au/~johnm
