-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
On 15-06-22 11:36 AM, Ian Carroll wrote:
I raised this question yesterday on Cross Validated (
stats.stackexchange.com/q/158043/43122), but now realize this might
be the more appropriate forum for help. Please excuse the
cross-post if you track both.
I simulated count data with an observation level random effect,
then fit a Poisson-family GLMM using glmer (lme4 version 1.1-7).
The fitted random intercepts show a strange pattern when compared
to the simulated data, and produce a kinked QQ plot. If this is
expected, why is it okay? If not, what am I doing wrong?
Simulated data and fitting:
n <- 1000 df <- data.frame(obs=factor(1:n)) df$x <- rnorm(n) df$r
<- rnorm(n, sd=3) df$y <- rpois(n=n, lambda=exp(2*df$x + df$r))
glmer.fit <- glmer(y ~ x + (1|obs), family='poisson', data=df)
df$r.est <- ranef(glmer.fit)$obs[ , '(Intercept)'] plot(df$r,
df$r.est) qqnorm(df$r.est)
My results (posted on CV) are two overlapping clusters of points:
in one r and r.est show the expected correlation, in the other
there is no correlation. The normal QQ plot makes it look like
different normal distributions are obtained for positive and
negative intercepts.
Thanks, Ian
I think this is inevitable. If you think about it, the distribution
of the *conditional* modes of the observations with y=0 must be
different from that of the conditional modes of the obs. with y>0.
This is much easier to see with an even more trivial model that
doesn't include a covariate (see below ...)
I don't think actually violates any of the assumptions we're using.
If you're worried about its effects on estimation & inference you
could do some simulation examples to see if it affects whatever it is
you're interested in (type I error rate, bias/variance of parameter
estimation, etc.)
library(lme4)
library(plyr) ## for mutate
set.seed(101)
n <- 1000
df <- data.frame(obs=factor(1:n),x=rnorm(n))
df <- mutate(df,
r=rnorm(n, sd=3),
y=rpois(n=n, lambda=exp(2*x + r)),
y2 = rpois(n=n,lambda=exp(r)))
glmer.fit <- glmer(y ~ x + (1|obs), family='poisson', data=df)
glmer.fit2 <- update(glmer.fit,nAGQ=20)
glmer.fit3 <- update(glmer.fit, y2 ~ 1 + (1|obs))
## df$r.est <- ranef(glmer.fit2)$obs[ , '(Intercept)']
df$r.est <- ranef(glmer.fit3)$obs[ , '(Intercept)']
cvec <- c("#000000","#FF000080")
cvec2 <- cvec[as.numeric(df$y2==0)+1]
plot(df$r, df$r.est,col=cvec2)
qqnorm(df$r.est,col=cvec[as.numeric(df$y2==0)+1])
df$y2cat <- cut(df$y2,right=FALSE,include.lowest=TRUE,
breaks=c(0:6,Inf))
library("ggplot2"); theme_set(theme_bw())
ggplot(df,aes(r.est,fill=y))+
geom_histogram(alpha=0.5,position="identity",
aes(y = ..density..))
ggplot(df,aes(r.est,fill=ycat))+
geom_density(alpha=0.5,position="identity")
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQEcBAEBAgAGBQJViFhMAAoJEOCV5YRblxUH8gMIAMiLYfW2C57AYTyrVjb660wg
uvAkJGQjZ9ORIdm/MqDRydKTuNV16r3WfwSPXWCdLDBJjpdDq2w3dO5CxsbYjOH+
q7L0NWdX0KRjb1oKF2muLbz38dvH6Z4ExjOmJlW3funuVzCVvzhIGV7N/iIMqp5D
yTnK2y/YeVySuO91MqBgbKNtH/mGTfH1iGMbmhrssF1y8EfeWrw6RF7/uti8vhWE
cvAUDtkz74Q1gfJVyeCRPVRj7ADRb6tN7FuWbqukfEoK5etqYkO5QRIv6BLR4qoU
Hs+zPCymno6JZZDpY/D1UG5HChyXJZjFN6Xj7e+gnPtAvBUa5UTIxfy+z1cblT4=
=Ygc1
-----END PGP SIGNATURE-----