
glm.nb - theta, dispersion, and errors

On Sun, 21 Oct 2012, Eiko Fried wrote:

            
My feeling is that this is caused by the type of response you have. It is not
really a count response and might be better modeled by an ordered response
model. Below I've simulated data from an ordered probit model and then
pretended they are count data. The results are the symptoms you describe:
underdispersion and non-convergence for theta in the NB model. Fitting an
ordered probit model avoids these problems for these data and (not
surprisingly) reasonably recovers the true parameters.

hth,
Z

## glm.nb() and polr() are provided by the MASS package
library("MASS")

## random regressor and latent Gaussian variable
set.seed(1)
x <- runif(200)
y0 <- 5 * x + rnorm(200)

## collapse to factor with 5 levels
yf <- cut(y0, c(-Inf, 3, 4, 5, 6, Inf), labels = 0:4)

## pretend levels are count variables
yc <- as.numeric(as.character(yf))

## quasi-Poisson with underdispersion
summary(glm(yc ~ x, family = quasipoisson))

## NB with problems in theta estimation
summary(glm.nb(yc ~ x))

## ordered probit recovers true parameters
summary(polr(yf ~ x, method = "probit", Hess = TRUE))

## compare with the latent linear model
summary(lm(y0 ~ x))
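
## Added sketch, not part of the original reply: pull out the individual
## numbers behind the summaries above. The object names 'qp' and 'op' are
## introduced here for illustration only. The calls assume the MASS package
## is installed and that x, yc and yf have been created as above.

## estimated dispersion of the quasi-Poisson fit
## (values below 1 indicate underdispersion)
qp <- glm(yc ~ x, family = quasipoisson)
summary(qp)$dispersion

## ordered probit fit: slope and cutpoints
op <- MASS::polr(yf ~ x, method = "probit", Hess = TRUE)
coef(op)  ## compare with the slope 5 used to simulate y0
op$zeta   ## compare with the breaks 3, 4, 5, 6 passed to cut()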