scale factors/overdispersion in GLM: possible bug?

1 message · Brian Ripley

A belated reply: I have been away and then busy (including on this).
A bug.  Using the appropriate tail for accuracy is irrelevant, though.
I've worked over all these for 1.1.0.  Be careful about how much
consistency to expect, though.  drop1 is generic and has an argument
called 'scale', so drop1.glm must use the same argument name even
though 'dispersion' might be more appropriate.
Over-dispersion is discussed at length in

Collett, D. (1991) Modelling Binary Data

and somewhat less discursively in

Cox & Snell (1989) Analysing Binary Data
McCullagh & Nelder (1989, section 4.5)
Aitkin, Anderson, Francis, Hinde (1983) Statistical Modelling in GLIM

amongst others.  Almost all of this concerns binomial models, but
Aitkin et al. do consider Poisson.

Basically, all find some justification for a model whose variance is
phi >= 1 times that given by a binomial or Poisson family.  For X_i ~
binomial(n, p_i), n > 1 not depending on i, this is known as the
Williams model in some circles.  So people since Williams (and, I seem
to remember, before) have been fitting such models and estimating phi by
residual deviance/residual df.  That is a quasi-likelihood procedure,
since even where a genuine probability model exists, the glm fit is not
its MLE.
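
As a rough illustration of estimating phi by residual deviance/residual
df, here is a minimal sketch in plain Python (standard library only).
It assumes the simplest possible fit, a single common probability p for
all groups, so the residual df is the number of groups minus one.  The
function name binomial_dispersion and the counts are made up for the
example; nothing here is taken from the glm code itself.

```python
import math

def binomial_dispersion(successes, trials):
    """Estimate phi as residual deviance / residual df for a common-p binomial fit."""
    k = len(successes)
    p_hat = sum(successes) / sum(trials)  # MLE of the common probability

    def term(obs, fitted):
        # obs * log(obs / fitted), with the convention 0 * log(0) = 0
        return obs * math.log(obs / fitted) if obs > 0 else 0.0

    # Binomial deviance: successes and failures each contribute a term
    deviance = 2.0 * sum(
        term(x, n * p_hat) + term(n - x, n * (1.0 - p_hat))
        for x, n in zip(successes, trials)
    )
    resid_df = k - 1  # one fitted parameter (the common p)
    return deviance / resid_df, deviance, resid_df

# Made-up counts out of n = 20 per group, deliberately more variable than binomial
successes = [2, 15, 7, 18, 4, 11]
trials = [20] * 6
phi_hat, deviance, resid_df = binomial_dispersion(successes, trials)
print(f"deviance = {deviance:.2f} on {resid_df} df, phi_hat = {phi_hat:.2f}")
```

A value of phi_hat well above 1 is the usual sign of over-dispersion
relative to the binomial variance.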

To make sense of this I have introduced two new families, quasibinomial
and quasipoisson, that use the Williams model; the binomial and poisson
families now never allow phi to be estimated.  For the Williams model
there is an "F" test option to add1 and drop1 (and if you use "F" with
a binomial or poisson family it tells you it is really using the
quasi-version).  This applies to summary.glm, predict.glm, anova.glm,
add1.glm and drop1.glm.  (As quasi models have no likelihood, they
have no AIC either, and so step will not work for them.)
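
For readers wondering what an "F" test looks like once phi is
estimated, the following is a minimal sketch in plain Python of one
common form of the analysis-of-deviance F statistic: the change in
deviance per degree of freedom, divided by the dispersion estimated
from the larger model.  This is an assumption about the general
technique, not a transcription of what add1.glm or drop1.glm compute;
the function name deviance_f_statistic and the deviances are
hypothetical.

```python
def deviance_f_statistic(dev_small, df_small, dev_big, df_big):
    """F statistic for comparing two nested quasi-likelihood fits.

    dev_*: residual deviances; df_*: residual degrees of freedom.
    The 'small' model is nested in the 'big' one, so df_small > df_big.
    """
    if df_small <= df_big:
        raise ValueError("the smaller model must have more residual df")
    phi_hat = dev_big / df_big  # dispersion estimated from the larger model
    numerator = (dev_small - dev_big) / (df_small - df_big)
    return numerator / phi_hat, phi_hat

# Hypothetical deviances: dropping one term raises the deviance from 60 to 90
f_stat, phi_hat = deviance_f_statistic(
    dev_small=90.0, df_small=21, dev_big=60.0, df_big=20
)
print(f"phi_hat = {phi_hat:.1f}, F = {f_stat:.1f}")
```

The statistic would be referred to an F distribution on
(df_small - df_big, df_big) degrees of freedom; with phi fixed at 1, as
for the plain binomial and poisson families, the change in deviance
would be referred to a chi-squared distribution instead.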