
Logistic regression with 2 categorical predictors

Andrew,

If the 24 rows are the data you are analysing, I cannot replicate any of your significant results within the glm framework *if* I take the overdispersion into account. The full model with the age*test interaction is saturated and cannot be analysed at all, but the main-effects model age+test can be analysed (as can either term separately). However, the results are overdispersed, so you should use family=quasibinomial instead of family=binomial, and then test="F" instead of test="Chi":
Analysis of Deviance Table

Model: quasibinomial, link: logit

Response: cbind(prefer, avoid)

Terms added sequentially (first to last)


     Df Deviance Resid. Df Resid. Dev      F Pr(>F)
NULL                    23     54.908              
age   5  11.2352        18     43.673 0.9844 0.4591
test  3   1.5934        15     42.079 0.2327 0.8722

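As you can see, the Resid. Dev is much larger than the Resid. Df at both steps of this sequential model (43.673/18 is about 2.4 and 42.079/15 is about 2.8, where a ratio near 1 would be expected without overdispersion). Therefore you must use quasibinomial models and F-tests, and these give results similar to other tests.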
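A minimal sketch of the calls described above. The 24 rows are not reproduced in this message, so the data frame `d` here is simulated: the counts, the 10 animals per row, and the object names `d`, `m_bin` and `m_quasi` are all assumptions for illustration, and the output will not match the table above.

```r
# Simulated stand-in for the 24 rows (6 age levels x 4 tests, one row per cell).
# These counts are invented for illustration; they are NOT the data from the thread.
set.seed(1)
d <- expand.grid(age = factor(1:6), test = factor(1:4))
n <- 10                                  # hypothetical number of animals per row
p <- plogis(rnorm(nrow(d), sd = 1))      # row-level noise, a source of overdispersion
d$prefer <- rbinom(nrow(d), size = n, prob = p)
d$avoid  <- n - d$prefer

# Main-effects binomial fit: compare residual deviance with residual df.
m_bin <- glm(cbind(prefer, avoid) ~ age + test, family = binomial, data = d)
c(resid_dev = deviance(m_bin), resid_df = df.residual(m_bin))

# Same model, but with the dispersion estimated from the data.
m_quasi <- glm(cbind(prefer, avoid) ~ age + test, family = quasibinomial, data = d)
summary(m_quasi)$dispersion              # Pearson chi-square / residual df

# Sequential analysis of deviance with F-tests instead of chi-square tests.
anova(m_quasi, test = "F")
```

The coefficients of the two fits are identical. Only the standard errors and the tests change, because the quasibinomial fit scales them by the estimated dispersion.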

I could not get any results for the saturated models, and one of your examples (below in this thread) seemed to use only one level of test and only *six* observations: it was also saturated as you had no replicates for your six age levels. You need replicates.
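To illustrate why the saturated models give nothing to test against, here is a small sketch with made-up counts (again not the data from this thread; `d`, `m_full`, `one_test` and `m_six` are hypothetical names). With one row per age x test cell, the interaction model uses as many parameters as there are rows, and the same happens with six rows and six age levels.

```r
# Made-up counts, one row per age x test cell (NOT the data from the thread).
d <- expand.grid(age = factor(1:6), test = factor(1:4))
d$prefer <- 1 + (seq_len(nrow(d)) * 7) %% 9   # arbitrary values between 1 and 9
d$avoid  <- 10 - d$prefer                      # hypothetical 10 animals per row

# Full interaction model: 24 parameters for 24 rows.
m_full <- glm(cbind(prefer, avoid) ~ age * test, family = binomial, data = d)
df.residual(m_full)   # 0: no residual df left for a dispersion estimate or a test

# One level of test only: 6 rows and 6 age levels, so saturated again.
one_test <- subset(d, test == "1")
m_six <- glm(cbind(prefer, avoid) ~ age, family = binomial, data = one_test)
df.residual(m_six)    # 0
```

With replicate rows per cell the residual df would be positive, and the dispersion could be estimated.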

However, I'm not sure I understood your data correctly. It looks like you have a certain number of animals, but their number decreases with age, so that you have a kind of censored data (animals not available in all cases). If that is the case, perhaps somebody can propose a better analysis for such censored data.

Cheers, Jari Oksanen
On 24/10/2014, at 10:41 AM, Andrew Halford wrote: