glm with binomial errors - problem with overdispersion
6 messages · Anna Mill, Brian Ripley, Peter Dalgaard
I presume you intended 'type' and 'fragment' to be factors (see below). Such a model would fit exactly. The additive model
model <- glm(y ~ fragment+type, binomial)
is only modestly over-dispersed, and shows that 'fragment' has zero effect. Not 'a negligible effect', but no effect. So something really odd is going on: is this an exercise with artificial data? Otherwise you need to explain the exact balance between the two 'fragments' (each fragment has exactly 1/4 success) and your assumption of independent binomial sampling cannot be true. Using a quasibinomial model does not change the deviance (see e.g. McCullagh and Nelder for the definitions, including of 'scaled deviance'), but it does change the standard errors.
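A minimal sketch of the two points above, using the data posted further down in the thread. The object names `sat` and `add` are illustrative and do not come from the original posts; the only change to the poster's setup is that 'fragment' and 'type' are declared as factors.

```r
# Data as posted in the original question
success  <- c(14, 43, 44, 1, 13, 28, 56, 8)
failure  <- c(88, 59, 58, 101, 92, 77, 49, 97)
fragment <- factor(c(1, 1, 1, 1, 2, 2, 2, 2))
type     <- factor(c(1, 2, 3, 4, 1, 2, 3, 4))
y <- cbind(success, failure)

# Interaction model with factors: 8 parameters for 8 observations,
# so it reproduces the observed proportions exactly
sat <- glm(y ~ fragment * type, family = binomial)
df.residual(sat)   # residual degrees of freedom
deviance(sat)      # numerically zero

# Additive model suggested in the reply
add <- glm(y ~ fragment + type, family = binomial)
coef(add)["fragment2"]            # fragment effect, numerically zero
deviance(add) / df.residual(add)  # rough check for overdispersion
```

The ratio of residual deviance to residual degrees of freedom is only a rough guide; values well above 1 point towards overdispersion.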
On Mon, 13 Jun 2011, Anna Mill wrote:
Dear all, I am new to R and my question may be trivial to you... I am doing a GLM with binomial errors to compare proportions of species in different categories of seed sizes (4 categories) between 2 sites.
You have types and fragments but no species and no sites. At least 'sites' should be a factor, as should 'categories of seed sizes'.
In the model summary the residual deviance is much higher than the degree of freedom (Residual deviance: 153.74 on 4 degrees of freedom) and even after correcting for overdispersion by using a quasibinomial error structure instead of binomial the residual deviance does not change. Is this a data problem and I cannot use this statistic or is it because I do something wrong with R (see models attached)?
Thanks a lot for your help!
Anna
first model with binomial error structure:
success<-c(14,43,44,1,13,28,56,8)
failure<-c(88,59,58,101,92,77,49,97)
"fragment"<-c(1,1,1,1,2,2,2,2)
"type"<-c(1,2,3,4,1,2,3,4)
y<-cbind(success,failure)
model<-glm(y~fragment*type,binomial)
summary(model)
Call:
glm(formula = y ~ fragment * type, family = binomial)

Deviance Residuals:
      1        2        3        4        5        6        7        8
-4.0175   3.3716   4.5052  -6.0071  -2.8063   0.5449   6.0414  -5.0184

Coefficients:
              Estimate Std. Error z value Pr(>|z|)
(Intercept)    0.04433    0.61072   0.073   0.9421
fragment      -0.65477    0.39001  -1.679   0.0932 .
type          -0.46664    0.23027  -2.027   0.0427 *
fragment:type  0.26636    0.14455   1.843   0.0654 .
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 157.96  on 7  degrees of freedom
Residual deviance: 153.74  on 4  degrees of freedom
AIC: 196.31

Number of Fisher Scoring iterations: 5
second model with quasibinomial error structure:
summary(model2)
Call:
glm(formula = y ~ fragment * type, family = quasibinomial)

Deviance Residuals:
      1        2        3        4        5        6        7        8
-4.0175   3.3716   4.5052  -6.0071  -2.8063   0.5449   6.0414  -5.0184

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)    0.04433    3.63550   0.012    0.991
fragment      -0.65477    2.32169  -0.282    0.792
type          -0.46664    1.37073  -0.340    0.751
fragment:type  0.26636    0.86048   0.310    0.772

(Dispersion parameter for quasibinomial family taken to be 35.43628)

    Null deviance: 157.96  on 7  degrees of freedom
Residual deviance: 153.74  on 4  degrees of freedom
AIC: NA

Number of Fisher Scoring iterations: 5
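The two summaries differ only in the standard errors, which can be checked directly. The following is a sketch that refits both models with the numeric predictors exactly as posted; the names `m_bin`, `m_quasi` and `phi` are illustrative and not from the original post.

```r
# Data and (numeric) predictors as in the original post
success  <- c(14, 43, 44, 1, 13, 28, 56, 8)
failure  <- c(88, 59, 58, 101, 92, 77, 49, 97)
fragment <- c(1, 1, 1, 1, 2, 2, 2, 2)
type     <- c(1, 2, 3, 4, 1, 2, 3, 4)
y <- cbind(success, failure)

m_bin   <- glm(y ~ fragment * type, family = binomial)
m_quasi <- glm(y ~ fragment * type, family = quasibinomial)

# Point estimates and residual deviance are identical
all.equal(coef(m_bin), coef(m_quasi))
all.equal(deviance(m_bin), deviance(m_quasi))

# Dispersion estimate: Pearson chi-square divided by residual df
phi <- sum(residuals(m_bin, type = "pearson")^2) / df.residual(m_bin)

# Quasibinomial standard errors are the binomial ones times sqrt(phi)
se_bin   <- coef(summary(m_bin))[, "Std. Error"]
se_quasi <- coef(summary(m_quasi))[, "Std. Error"]
all.equal(se_quasi, se_bin * sqrt(phi))
```

This is why the residual deviance stays at 153.74 in both fits: the quasibinomial family rescales the standard errors by the estimated dispersion but leaves the fitted values, and hence the deviance, untouched.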
Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UK Fax: +44 1865 272595
On Jun 14, 2011, at 08:13 , Prof Brian Ripley wrote:
I presume you intended 'type' and 'fragment' to be factors (see below). Such a model would fit exactly. The additive model
model <- glm(y ~ fragment+type, binomial)
is only modestly over-dispersed, and shows that 'fragment' has zero effect. Not 'a negligible effect', but no effect. So something really odd is going on: is this an exercise with artificial data? Otherwise you need to explain the exact balance between the two 'fragments' (each fragment has exactly 1/4 success) and your assumption of independent binomial sampling cannot be true.
Also note that success+failure is exactly 102 in fragment 1 and 105 in fragment 2, as is the sum of the successes for each fragment (of course it has to be, to make exactly 1/4). It is rather easy to suspect that it is actually a 0/1 coding of the type (as in "tick exactly one box"), and not independent binomial data.
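The totals mentioned above can be checked from the posted data. This is a small sketch; the names `row_total`, `succ_total` and `prop` are illustrative.

```r
# Data as posted in the original question
success  <- c(14, 43, 44, 1, 13, 28, 56, 8)
failure  <- c(88, 59, 58, 101, 92, 77, 49, 97)
fragment <- c(1, 1, 1, 1, 2, 2, 2, 2)

# success + failure takes a single value within each fragment
row_total <- tapply(success + failure, fragment, unique)
row_total

# The successes within each fragment sum to that same value
succ_total <- tapply(success, fragment, sum)
succ_total

# Overall success proportion per fragment
prop <- succ_total / tapply(success + failure, fragment, sum)
prop
```

Each fragment contributes four rows with the same denominator, and the successes across those four rows add up to that denominator, which is the pattern expected if every unit falls into exactly one of the four types.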
Peter Dalgaard Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000 Frederiksberg, Denmark Phone: (+45)38153501 Email: pd.mes at cbs.dk Priv: PDalgd at gmail.com
On Jun 14, 2011, at 09:53 , Anna Mill wrote:
Also note that success+failure is exactly 102 in fragment 1 and 105 in fragment 2, as is the sum of the successes for each fragment (of course it has to be, to make exactly 1/4). It is rather easy to suspect that it is actually a 0/1 coding of the type (as in "tick exactly one box"), and not independent binomial data.
Sorry for the dumb question: so do you think that my data is independent and the model appropriate?
Thanks, Anna
Well, it's your data, and only you can tell what the original data looks like. We can only _suspect_ that they might be generated to be mutually exclusive. If you do not have independent binomial data, then a glm(..., binomial) will be seriously inappropriate (and a simple chi-square on the table of "successes" by type and fragment will be the obvious thing to do).
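A sketch of the chi-square test suggested above, assuming the "successes" really are counts of mutually exclusive types within each fragment. The names `tab` and `res` are illustrative.

```r
# "Successes" as posted: types 1-4 for fragment 1, then for fragment 2
success <- c(14, 43, 44, 1, 13, 28, 56, 8)

# 4 x 2 contingency table: rows are types, columns are fragments
tab <- matrix(success, nrow = 4,
              dimnames = list(type = as.character(1:4),
                              fragment = as.character(1:2)))
tab

# Pearson chi-square test of independence between type and fragment
res <- chisq.test(tab)
res$expected   # expected counts under independence
res
```

The expected counts for type 4 fall a little below 5, so R is likely to warn that the chi-squared approximation may be inaccurate; `chisq.test(tab, simulate.p.value = TRUE)` or `fisher.test(tab)` are possible alternatives in that case.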