
Overdispersion with binomial distribution

4 messages · Jessica L Hite/hitejl/O/VCU, Brian Ripley, Ben Bolker

#
Jessica L Hite/hitejl/O/VCU <hitejl <at> vcu.edu> writes:
In principle, in the null case (i.e. data are really binomial)
the deviance is chi-squared distributed with the df equal
to the residual df.

  For example:

example(glm) ## creates glm.D93, a Poisson fit to the Dobson (1990) counts
deviance(glm.D93) ## 5.13
summary(glm.D93)$dispersion ## 1 (by definition)
dfr <- df.residual(glm.D93)
deviance(glm.D93)/dfr ## 1.28
d2 <- sum(residuals(glm.D93,"pearson")^2) ## 5.17
(disp2 <- d2/dfr)  ## 1.293

gg2 <- update(glm.D93,family=quasipoisson)
summary(gg2)$dispersion  ## 1.293, same as above

pchisq(d2,df=dfr,lower.tail=FALSE) ## p-value for the Pearson chi-squared statistic

all.equal(coef(glm.D93),coef(gg2)) ## TRUE

se1 <- coef(summary(glm.D93))[,"Std. Error"]
se2 <- coef(summary(gg2))[,"Std. Error"]
se2/se1 ## each ratio equals sqrt(disp2)

# (Intercept)    outcome2    outcome3  treatment2  treatment3 
#   1.137234    1.137234    1.137234    1.137234    1.137234 

sqrt(disp2)
# [1] 1.137234
Severe overdispersion may indicate model lack of fit.  Have
you examined residuals/data for outliers etc.?  

  quasibinomial should be fine, or you can try beta-binomial
(see the aod package) ...
That's as expected.
You don't really need MASS for quasibinomial; the family is in
base R's stats package.
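
As a rough illustration of the quasibinomial suggestion, here is a minimal sketch on simulated data. The data-generating setup (beta-distributed success probabilities, the variable names, and every parameter value) is an assumption made up for this example and has nothing to do with the original poster's data. It uses only base R.

```r
## Simulated overdispersed binomial data (beta-binomial mechanism).
## All names and parameter values below are illustrative assumptions.
set.seed(1)
n_obs  <- 200                       # number of observations
size   <- 20                        # binomial trials per observation
x      <- runif(n_obs)              # a single covariate
p_mean <- plogis(-0.5 + 1.0 * x)    # mean success probability
phi    <- 5                         # smaller phi = more overdispersion
p      <- rbeta(n_obs, p_mean * phi, (1 - p_mean) * phi)
y      <- rbinom(n_obs, size, p)

## Plain binomial fit, then the same model with a quasibinomial family
fit_b  <- glm(cbind(y, size - y) ~ x, family = binomial)
fit_qb <- update(fit_b, family = quasibinomial)

## Estimated dispersion (Pearson chi-squared / residual df)
disp <- summary(fit_qb)$dispersion
disp

## Point estimates are identical; only the standard errors change
all.equal(coef(fit_b), coef(fit_qb))

## Standard errors are inflated by sqrt(dispersion)
se_ratio <- coef(summary(fit_qb))[, "Std. Error"] /
            coef(summary(fit_b))[, "Std. Error"]
se_ratio
sqrt(disp)
```

With these settings the estimated dispersion should come out well above 1. As in the quasipoisson example earlier in this message, the coefficients do not move and every standard error is scaled by the same factor.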
#
On Tue, 17 Feb 2009, Ben Bolker wrote:

            
*Approximately*, provided the expected counts are not near or below 
one.  See MASS §7.5 for an analysis of the size of the approximation 
errors (which can be large and in both directions).

Given that I once had a consulting job where the over-dispersion was 
causing something close to panic and was entirely illusory, the lack 
of the 'approximately' can have quite serious consequences.
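
The point about small expected counts can be checked with a small null-case simulation. This is only a sketch under assumed settings: the helper sim_ratio, the sample size, the number of replicates and the three means are all invented for illustration. The data are pure Poisson, so any departure of deviance/df from 1 is approximation error rather than real over- or under-dispersion.

```r
## Null-case simulation: Poisson counts with NO overdispersion.
## sim_ratio() and all settings are hypothetical, for illustration only.
set.seed(101)

sim_ratio <- function(mu, n = 100, nsim = 500) {
  ## average of deviance / residual df over nsim simulated data sets
  mean(replicate(nsim, {
    y   <- rpois(n, mu)
    fit <- glm(y ~ 1, family = poisson)
    deviance(fit) / df.residual(fit)
  }))
}

mus    <- c(0.2, 1, 20)             # small, moderate and large expected counts
ratios <- sapply(mus, sim_ratio)
names(ratios) <- mus
round(ratios, 2)
```

The average ratio should fall clearly below 1 for the smallest mean, somewhat above 1 for a mean of 1, and close to 1 for the largest mean, even though the model is exactly right in all three cases.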

  
    
#
Thanks for the clarification.
  I actually had MASS open to that page while
I was composing my reply but forgot to mention
it (trying to do too many things at once) ...

  Ben Bolker
Prof Brian Ripley wrote: