glm offset and interaction bugs (PR#4941)
On Tue, 4 Nov 2003 charlie@stat.umn.edu wrote:
Full_Name: Charles J. Geyer Version: 1.8.0 OS: i686-pc-linux-gnu (Suse 8.2) Submission from: (NULL) (134.84.86.22) Two bugs (perhaps related, perhaps independent) revealed by the same Poisson regression with offset
Only one bug, i think. There is certainly a bug in handling offset() in formulas with interactions (which is presumably fairly recent, since the C code for offsets dates only from January this year). The latter two models, though, are the same AFAICS, or more precisely are reparametrisations of the same underidentified model. The only way to make + commutative in this example would be to have some arbitrary rule based on the name of the variable for which parameters to drop, and that would have the even more unintuitive effect that changing the name of a variable would affect the output.
mydata <- read.table(url("http://www.stat.umn.edu/geyer/5931/mle/seeds.txt"))
out.fubar <- glm(seedlings ~ burn01 + vegtype * burn02 +
offset(log(totalseeds)), data = mydata, family = poisson)
summary(out.fubar)
out.barfu <- glm(seedlings ~ burn01 + vegtype * burn02,
offset = log(totalseeds), data = mydata, family = poisson)
summary(out.barfu)
out.ok <- glm(seedlings ~ vegtype * burn02 + burn01,
offset = log(totalseeds), data = mydata, family = poisson)
summary(out.ok)
As far as I can tell from reading the documentation, these should produce
the same results. They don't. The regression coefficient for the
offset term in the first (fubar) regression is bogus. That's not what
offset() is supposed to do. Note that offset() works properly in
out <- glm(seedlings ~ vegtype + burn01 + burn02 + offset(log(totalseeds)),
data = mydata, family = poisson)
summary(out)
So is is only partially bogus -- very dangerous for users that are less
than hyperalert.
The difference between out.barfu and out.ok shows that "+" in formulas
is noncommutative, which is very mind bending.
The regression in out.ok is o. k. It checks by hand.
For a more complete explanation (if more is wanted), including
the printout from these summary commands on my machine and the
check of out.ok "by hand", see
http://www.stat.umn.edu/geyer/5931/mle/seed2.Rnw
http://www.stat.umn.edu/geyer/5931/mle/seed2.pdf
______________________________________________ R-devel@stat.math.ethz.ch mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-devel
Thomas Lumley Assoc. Professor, Biostatistics tlumley@u.washington.edu University of Washington, Seattle