Skip to content

lmer, poisson family and offset variable

2 messages · Ivar Herfindal, Douglas Bates

#
Dear Mixed-Models-list

In poisson models, it can often be convenient with an offset variable, 
in order to account for known linear relationships that influence the 
counts (number of occurrences), e.g. observation time or patch size. In 
glm, (and, I hope, lmer), this is simply added as e.g. 
offset(log(exposure.time)) in the formula. This seems to work fine for 
me, and the parameter estimates, se and z-statistics seems reasonable. 
However, when I try the mcmcsamp to look at the distribution of the 
estimates, it looks horrible. Adding exposure time as fixed factor (i.e. 
log(exposure.time)) in the model instead of as an offset gives 
distributions that are reasonable and as expected. Do anyone know if 
offset is compatible with lmer and/or mcmcsamp, or can explain why the 
distribution become so weird when using exposure time as offset variable 
compared to as a fixed variable? Am I doing something completely wrong 
here?

I provide a short example below (the data is perhaps not very realistic 
in terms of values, but provide a a design close to the one I am using. 
It works as an example to show you how the distribution looks like with 
exposure time as offset or fixed variable).

Otherwise, the lme4 package works wonderful and is the package I use 
most frequent in R. Thanks a lot to the developer(s).

Sincerely

Ivar Herfindal

##
set.seed(100)
testdata <- cbind.data.frame(
    Obs=rpois(50, 20),
    gr=rep(1:10, 5),
    fixed=rnorm(50))
testdata$exp.time <- testdata$Obs - (rnorm(50, 1,1))
mod.offset <- lmer(Obs ~ offset(log(exp.time)) + fixed + (1|gr), 
data=testdata, family=poisson)
mod.offset.mcmcsamp <- mcmcsamp(mod.offset, 10000)
mod.no.offset <- lmer(Obs ~ (log(exp.time)) + fixed + (1|gr), 
data=testdata, family=poisson)
mod.no.offset.mcmcsamp <- mcmcsamp(mod.no.offset, 10000)

# histogram of parameter distribution with offset
windows()
par(mfrow=c(2,2))
for(i in 1:3)
hist(mod.offset.mcmcsamp[,i], col="gray", 
main=colnames(mod.offset.mcmcsamp)[i])

# histogram of parameter distribution without offset
windows()
par(mfrow=c(2,2))
for(i in 1:4)
hist(mod.no.offset.mcmcsamp[,i], col="gray", 
main=colnames(mod.no.offset.mcmcsamp)[i])

##
sessionInfo()
R version 2.4.1 (2006-12-18)
i386-pc-mingw32
attached base packages:
[1] "stats"     "graphics"  "grDevices" "utils"     "datasets"  
"methods"   "base"    

other attached packages:
       lme4      Matrix     lattice
"0.9975-11"  "0.9975-8"   "0.14-16"
#
On 2/7/07, Ivar Herfindal <ivar.herfindal at bio.ntnu.no> wrote:
Thank you for providing the example.

I am not surprised that there is a problem with such an example.  I
can think of two possible causes - either I forgot to take into
account the offset when formulating the mcmcsamp or this could be
related to a generic problem with mcmcsamp for generalized linear
mixed models in which it gets stuck at particular values of the fixed
effects.  The mcmcsamp function applied to linear mixed models uses
exact sampling for three subsets of the parameters (\sigma^2, the
relative variance-covariance matrix for the random effects, and the
fixed effects and random effects sampled jointly).  I have used the
same approach in mcmcsamp applied to a generalized linear mixed model
but there is no exact sampling for the last group of parameters.  I
use a Metropolis-Hastings  method for those and sometimes it is
unsuccessful in moving the values of these parameters.

I would suggest looking at the time series plot of the MCMC sample
(see an earlier response on this list today for doing this with
library(coda); xyplot(...))  to see if the problem is the series
getting stuck.