glm with binomial errors - problem with overdispersion

6 messages · Anna Mill, Brian Ripley, Peter Dalgaard

#
I presume you intended 'type' and 'fragment' to be factors (see 
below).  Such a model would fit exactly.  The additive model
is only modestly over-dispersed, and shows that 'fragment' has zero 
effect.  Not 'a negligible effect', but no effect.  So something 
really odd is going on: is this an exercise with artificial data?
Otherwise you need to explain the exact balance between the two 
'fragments' (each fragment has exactly 1/4 success), because as it 
stands your assumption of independent binomial sampling cannot be true.
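
To make the point about factors concrete, here is a minimal sketch. The counts are invented for illustration and are not the original poster's data, and it assumes the model in question included the type-by-fragment interaction. With both variables as factors, that model has as many parameters as cells and so fits exactly, whereas the additive model leaves residual degrees of freedom on which over-dispersion can be judged.

```r
# Hypothetical counts for illustration only; NOT the data from this thread.
d <- data.frame(
  type     = rep(1:4, times = 2),   # stored as numbers, as often happens on import
  fragment = rep(1:2, each = 4),
  success  = c(12, 30, 25, 40, 20, 22, 35, 28),
  failure  = c(48, 30, 35, 20, 40, 38, 25, 32)
)

# Convert the codes to factors so each level gets its own coefficient
d$type     <- factor(d$type)
d$fragment <- factor(d$fragment)

# Interaction model: 8 parameters for 8 cells, so it is saturated
fit_sat <- glm(cbind(success, failure) ~ type * fragment,
               family = binomial, data = d)
df.residual(fit_sat)   # no residual degrees of freedom
deviance(fit_sat)      # zero up to rounding error

# Additive model: leaves 3 residual degrees of freedom
fit_add <- glm(cbind(success, failure) ~ type + fragment,
               family = binomial, data = d)
df.residual(fit_add)
deviance(fit_add) / df.residual(fit_add)   # rough check for over-dispersion
```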

Using a quasibinomial model does not change the deviance (see e.g. 
McCullagh and Nelder for the definitions, including of 'scaled 
deviance'), but it does change the standard errors.
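
A small sketch of that last point, again on invented counts rather than the data from this thread: the binomial and quasibinomial fits share coefficients and residual deviance, and the quasibinomial standard errors are the binomial ones multiplied by the square root of the estimated dispersion.

```r
# Hypothetical counts for illustration only; NOT the data from this thread.
d <- data.frame(
  type     = factor(rep(1:4, times = 2)),
  fragment = factor(rep(1:2, each = 4)),
  success  = c(12, 30, 25, 40, 20, 22, 35, 28),
  failure  = c(48, 30, 35, 20, 40, 38, 25, 32)
)

fit_b  <- glm(cbind(success, failure) ~ type + fragment,
              family = binomial, data = d)
fit_qb <- glm(cbind(success, failure) ~ type + fragment,
              family = quasibinomial, data = d)

# Point estimates and residual deviance are identical
all.equal(coef(fit_b), coef(fit_qb))
all.equal(deviance(fit_b), deviance(fit_qb))

# The standard errors differ by a factor sqrt(dispersion)
phi   <- summary(fit_qb)$dispersion   # Pearson chi-square / residual df
se_b  <- coef(summary(fit_b))[, "Std. Error"]
se_qb <- coef(summary(fit_qb))[, "Std. Error"]
all.equal(se_qb, se_b * sqrt(phi))
```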
On Mon, 13 Jun 2011, Anna Mill wrote:

            
You have types and fragments but no species and no sites.  At least 
'sites' should be a factor, as should 'categories of seed sizes'.

  
    
#
On Jun 14, 2011, at 08:13 , Prof Brian Ripley wrote:

            
Also note that success+failure is exactly 102 in fragment 1 and 105 in fragment 2, and so is the sum of the successes within each fragment (of course it has to be, to make exactly 1/4). It is rather easy to suspect that it is actually a 0/1 coding of the type (as in "tick exactly one box"), and not independent binomial data.
#
On Jun 14, 2011, at 09:53 , Anna Mill wrote:

            
Well, it's your data, and only you can tell what the original data look like. We can only _suspect_ that they might be generated to be mutually exclusive.

If you do not have independent binomial data, then a glm(..., binomial) will be seriously inappropriate (and a simple chi-square on the table of "successes" by type and fragment will be the obvious thing to do).
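
For what that chi-square test might look like, here is a minimal sketch. The table is invented (only the column totals of 102 and 105 echo the numbers mentioned above), so the result says nothing about the real data.

```r
# Hypothetical table of "successes" by type and fragment; NOT the real data.
# Only the column totals (102 and 105) are taken from the thread.
succ <- matrix(
  c(30, 25, 27, 20,    # fragment 1
    22, 31, 24, 28),   # fragment 2
  nrow = 4,
  dimnames = list(type     = paste0("type", 1:4),
                  fragment = c("frag1", "frag2"))
)
colSums(succ)

# Pearson chi-square test of independence between type and fragment
res <- chisq.test(succ)
res$expected   # expected counts under independence
res            # statistic, degrees of freedom and p-value
```

This treats each observation as falling into exactly one type, which matches the "tick exactly one box" reading rather than independent binomial counts per type.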