Curved residuals vs fitted plot

Hi Victoria,

Positive correlation between the residuals and the fitted values is expected whenever you have a row-level random effect to mop up overdispersion. It happens because overdispersion implies that the low responses are lower than predicted and the high responses are higher than predicted, even allowing for binomial sampling error. The gap between a low predicted proportion and a still lower observed proportion is bridged partly by a negative residual and partly by a negative overdispersion random effect (which can be thought of as an additional 'free' residual on top of the residual that is constrained by binomial sampling error). The same happens at the highest predicted proportions, except that both the residual and the random effect will be positive. Because the fitted values contain the overdispersion random effects, the fitted values and the residuals end up positively correlated.

This problem is dealt with, for binomial GLMMs, in chapter 7 of Alain Zuur's Beginner's guide to GLM and GLMM with R. I don't have this book handy and I can't remember his solution. One solution is to shift the overdispersion random effect from the fitted values to the residuals, which makes sense given that they are really just add-on residuals - see the example code below.

This thread discusses the same problem for lognormal-Poisson GLMMs:
https://stat.ethz.ch/pipermail/r-sig-mixed-models/2013q3/020819.html

Best wishes,
Paul

# simulate binomial data with a positive trend and overdispersion
obs <- factor(1:100)
x <- seq(-1, 0.98, 0.02)
set.seed(12345678)
y <- rbinom(100, 20, plogis(x + rnorm(100, 0, 1)))
par(mfrow=c(3, 1))
plot(y/20 ~ x)
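# fit logitnormal-binomial GLMM (glmer() is in the lme4 package)
library(lme4)
# x is an exact linear rescaling of 1:100, so the fitted values are unchanged
fit <- glmer(cbind(y, 20-y) ~ x + (1|obs), family="binomial")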
summary(fit)
# standard residual vs fitted plot
plot(fitted(fit), resid(fit))
abline(h=0)
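# shift overdispersion random effect from fitted values to residuals,
# i.e. remove each row's random effect on the logit scale
# (Fitted should match predict(fit, re.form=NA, type="response"))
Fitted <- plogis(qlogis(fitted(fit)) - ranef(fit)$obs[[1]])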
Resid <- (y/20 - Fitted) / sqrt(Fitted * (1 - Fitted)/20)
plot(Fitted, Resid)
abline(h=0)
# NB THIS GETS RID OF THE TREND BUT ISN'T RIGHT!
# This is because I've standardised the residuals using the
# variance function for the binomial, p(1-p)/n, but the distribution
# underlying the residuals is now a logitnormal-binomial,
# so the variance should also incorporate the overdispersion variance.
# I don't know how to do this.
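
One possible way to bring the overdispersion variance into the standardisation is sketched below. It is an assumption on my part, not the solution from Zuur's book or from the linked thread. If logit(p) = eta + e with e ~ N(0, sigma^2), the law of total variance gives Var(y/n) = E[p(1-p)]/n + Var(p), and the marginal mean is E[p] rather than plogis(eta). The helper marginal_moments() is hypothetical (it is not part of lme4) and gets both moments by numerical integration. To keep the sketch self-contained it uses the true eta and sigma from a simulation. With a fitted model you would presumably plug in estimates instead, for example predict(fit, re.form=NA) for eta and the estimated random-effect standard deviation for sigma.

```r
# Hypothetical helper (not part of lme4): marginal mean and variance of y/n
# when y ~ Binomial(n, p), logit(p) = eta + e, e ~ N(0, sigma^2)
marginal_moments <- function(eta, sigma, n) {
  # E[g(p)] over the normal random effect, by numerical integration
  expect <- function(g) {
    integrate(function(e) g(plogis(eta + e)) * dnorm(e, 0, sigma),
              lower = -Inf, upper = Inf)$value
  }
  m1 <- expect(function(p) p)     # E[p]
  m2 <- expect(function(p) p^2)   # E[p^2]
  # law of total variance: E[p(1-p)]/n + Var(p)
  c(mean = m1, var = (m1 - m2) / n + (m2 - m1^2))
}

# simulate overdispersed binomial data as in the example above
set.seed(1)
n <- 20
sigma <- 1                        # ASSUMPTION: true value used as a stand-in for an estimate
x <- seq(-1, 0.98, 0.02)
y <- rbinom(length(x), n, plogis(x + rnorm(length(x), 0, sigma)))

# one row of (mean, var) per observation; eta = x is again the true value
mom <- t(sapply(x, marginal_moments, sigma = sigma, n = n))

# residuals standardised by the marginal (logitnormal-binomial) moments
Resid <- (y / n - mom[, "mean"]) / sqrt(mom[, "var"])
plot(mom[, "mean"], Resid)
abline(h = 0)
```

If the model is adequate, these residuals should show no trend against the marginal fitted values and should have variance of roughly 1, but I have only checked this against simulated data where the true parameters are known.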