Problem with the categorical predictor in the factor format at level 1 - R-SIG-mixed-models

Sunthud Pornprasertmanit · 2013-02-19T21:11:33Z


Tue, Feb 19, 2013 6:18 PM

Sunthud Pornprasertmanit <psunthud at ...> writes:

I believe this is a weakness in the way that lme4 constructs
random-effects model matrices.  It falls back on R's standard
model-matrix constructor (model.matrix()); in this case the formula
~0+SEX, considered by itself, gives rise to a "no-intercept" matrix,
which is *not* a one-column model matrix but rather a two-column matrix,
with one dummy (indicator) variable for each level of the factor.

For example:

d <- data.frame(SEX=factor(0:1))
model.matrix(~SEX,data=d)
##   (Intercept) SEX1
## 1           1    0
## 2           1    1

model.matrix(~0+SEX,data=d)
##   SEX0 SEX1
## 1    1    0
## 2    0    1

The second call gives two columns rather than the model matrix you want,
which is just

##    SEX1
## 1     0
## 2     1

The workaround is (as you have done) to create your own dummy
variable.
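
A minimal sketch of that workaround, using only base R so it runs without
lme4. The toy data and the column name SEX1 are assumptions made for
illustration; only SEX comes from the example above.

```r
# Hypothetical toy data; only the factor SEX is taken from the example above.
d <- data.frame(SEX = factor(c(0, 1, 1, 0)))

# Hand-made numeric 0/1 dummy for the second level of SEX.
# The name SEX1 is an assumption chosen to match the column label above.
d$SEX1 <- as.numeric(d$SEX == levels(d$SEX)[2])

# With the factor, suppressing the intercept still gives one column per level.
mm_factor <- model.matrix(~ 0 + SEX, data = d)

# With the numeric dummy, suppressing the intercept gives a single column.
mm_dummy <- model.matrix(~ 0 + SEX1, data = d)

ncol(mm_factor)
ncol(mm_dummy)
colnames(mm_dummy)
```

With a dummy like this, the random-effects term would presumably be written
along the lines of (0+SEX1|SCHOOL) instead of using the factor directly.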

The other disturbing part of this is that the model with (0+SEX|SCHOOL)
is actually unidentifiable (I think), but lmer goes ahead and fits
something for you anyway, without warning you.

This is definitely worth posting as an issue at
https://github.com/lme4/lme4/issues?state=open . If I get a
chance I will do it, but you are encouraged to do so yourself ...