negative binomial regression
No modification is required. The standard way in S to handle offsets is
via the offset() function, and that works in glm.nb. The offset argument
to R's glm is unnecessary.
See ?Insurance and try
    library(MASS)
    glm.nb(Claims ~ District + Group + Age + offset(log(Holders)),
           data = Insurance)
(which is not over-dispersed and so gives some warnings).
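As a minimal sketch of the point above (assuming the MASS package is installed; the object names fit_term, fit_arg and fit_nb are made up for illustration), the offset() formula term and glm's offset argument can be compared on the Insurance data, and the same formula then passed to glm.nb:

```r
library(MASS)  # provides glm.nb and the Insurance data set

# Poisson fit with the offset written as a term in the formula
fit_term <- glm(Claims ~ District + Group + Age + offset(log(Holders)),
                family = poisson, data = Insurance)

# The same model using glm's offset argument instead
fit_arg <- glm(Claims ~ District + Group + Age,
               family = poisson, data = Insurance,
               offset = log(Holders))

# Both routes should give the same coefficient estimates
all.equal(coef(fit_term), coef(fit_arg))

# Negative binomial fit: the formula-term form carries over unchanged.
# Warnings are suppressed here because, as noted above, these data are
# not over-dispersed and the theta estimation complains.
fit_nb <- suppressWarnings(
  glm.nb(Claims ~ District + Group + Age + offset(log(Holders)),
         data = Insurance)
)
fit_nb$theta  # a very large theta means the fit is close to Poisson
```

Writing the offset in the formula keeps the exposure attached to the model itself, so the same formula can be reused across fitting functions.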
On Mon, 24 Mar 2003, Ross Nelson wrote:
I would like to know if it is possible to perform negative binomial
regression with rate data (incidence density) using the glm.nb function
(in MASS).
I used a Poisson regression glm call to model the count of injuries
across census tracts. The glm call was adjusted to handle the data
as rates using the offset argument, since the population of census
tracts can vary by a factor of three.
e.g.

Call:
glm(formula = inj ~ lp + rdm, family = poisson(), data = ww,
    offset = log(pop))

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-17.2779   -2.6034   -0.4519    2.0837   16.9275

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.11593    0.01482 -75.290  < 2e-16 ***
lp2          0.11569    0.01477   7.835 4.70e-15 ***
lp3          0.02374    0.01763   1.346    0.178
lp4          0.17777    0.01922   9.248  < 2e-16 ***
rdm2        -0.08810    0.01747  -5.044 4.57e-07 ***
rdm3         0.08750    0.01533   5.706 1.15e-08 ***
rdm4         0.10513    0.01518   6.925 4.35e-12 ***
---
Signif. codes:  0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

inj and pop are interval, while lp and rdm are categorical.
A test of the dispersion indicates that the data are over-dispersed, and
thus that an alternative distribution should be used.
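The post does not say which dispersion test was used. One common informal check is the Pearson chi-square statistic divided by the residual degrees of freedom. The sketch below assumes that check and uses the Insurance data from MASS as a stand-in, since the ww data are not available; fit_pois, pearson_chisq and dispersion are illustrative names only.

```r
library(MASS)  # used here only for the Insurance data set

# Poisson rate model with log(exposure) as an offset
fit_pois <- glm(Claims ~ District + Group + Age + offset(log(Holders)),
                family = poisson, data = Insurance)

# Pearson chi-square statistic divided by the residual degrees of freedom
pearson_chisq <- sum(residuals(fit_pois, type = "pearson")^2)
dispersion <- pearson_chisq / df.residual(fit_pois)
dispersion  # values well above 1 suggest over-dispersion
```

A ratio near 1 is what a Poisson model would predict, so only a clearly larger value argues for moving to a negative binomial fit.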
I am not sure, however, whether or how to modify the glm.nb call to
handle this situation.
glm.nb(formula, ..., init.theta, link = log)
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595