updating functions mangling original copies of glmer fits?
I thought others might be interested in this exchange that Ben and I had over New Year's. ---------- Forwarded message ---------- From: Douglas Bates <bates at stat.wisc.edu> Date: Thu, Jan 1, 2009 at 4:26 PM Subject: Re: updating functions mangling original copies of glmer fits? To: Ben Bolker <bolker at ufl.edu> Thanks for the note, Ben.
On Sun, Dec 28, 2008 at 11:08 AM, Ben Bolker <bolker at ufl.edu> wrote:
Dear Prof Bates,
in trying to replicate the deviance-profiling code in the
Implementation vignette, I've run across a slightly scary side
effect. The code below generates some data from the model
eps_b ~ Normal(0,1)
x ~ Uniform(0,1)
Linpred = 1+2*x+eps_b
Y ~ Poisson(exp(Linpred))
that is, a fairly standard GLMM with a single continuous
covariate, a single random effect on the intercept,
log link, Poisson, etc. (I don't think any of this matters
that much but I'm trying to be careful).
Then I bundled the code used to generate Figure 2 in
Implementation.Rnw, with the addition of a .Call("mer_update_dev")
to make it work for a GLMM, into a function called varprof
that evaluates the model for a series of different random-effects
standard deviations.
The answers make sense so far, but ... the **original model
fit object** (which has been passed into the function, so ought to have
had an independent copy made???) has now ended up being modified
by the profiling code, so that its standard deviation is
now equal to the maximum sd tried in the profile ...
I would try to sort this out but it feels like very deep R magic ...
What you have observed is indeed a violation of the functional semantics of the S language, briefly summarized as "Thou shalt not modify the value of an argument without copying". The magic to do this is not deep - inside a C function called through .Call you can do whatever you want, including modifying the value of an argument. My excuse for doing things this way is the usual: efficiency. The mer object encapsulates the model and its current state during the optimization process. It would be possible to copy and update that object every time the deviance is evaluated during the optimization but a) that object can be huge b) even with all the tricks that are used for optimization, it still requires many, many evaluations Even one copy of such an object can be expensive. There was a recent message to the R-SIG-Mixed-Models mailing list regarding being able to fit the model but running out of memory when trying to summarize it. The problem was that there is a copy made in the summary method. In the currently released version of lme4 I finessed the copying issue by modifying the object in place during the optimization and keeping the whole thing behind closed doors, as it were. I suppose that when I wrote code such as that in the Implementation vignette I should have added a "Don't try this at home, kids" caveat but instead I modified (yet again!) the overall approach. If you check the code in the "allcoef" branch of the SVN repository you will see that the model is represented by an environment through the optimization phase and only then converted to an S4 object. Environments are special composite objects in that they are never copied. If an environment is passed to a function and modified during the evaluation of the function then the modifications are visible outside the function. As such they are a natural way of representing "state" during an optimization. There are a couple of other (well, many other) enhancements in the allcoef branch. It uses an explicit call to nlminb for the optimization with all the information encapsulated in the environment and there is provision for incorporating additional parameters in the optimization. The name "allcoef" comes from a switch in the role of the fixed-effects parameters in a generalized linear mixed model or a nonlinear mixed model. Now, all the coefficients in the linear predictor are optimized during the IRLS iterations (the "inner" optimization, which is well-controlled) and not during the general, constrained optimization (the "outer" optimization performed by nlminb). Would it be okay for me to copy this reply to the R-SIG-Mixed-Models email list? Happy New Year, Doug Bates
I can obviously work around this by re-fitting the object after
computing the profile to restore it, but it does feel like a bug ...
Haven't tested yet whether this problem is specific to mer_update_dev
or whether it happens with LMMs as well ...
cheers
Ben Bolker
set.seed(1001)
x <- runif(1000)
f <- factor(rep(1:10,each=100))
blockeff <- rnorm(10,sd=1)
linpred <- exp(1+2*x+blockeff[f])
y <- rpois(1000,linpred)
dat <- data.frame(x,f,y)
plot(x,1+y,type="n",log="y")
text(x,1+y,as.character(f))
library(lme4)
mod <- glmer(y~x+(1|f),data=dat,family=poisson)
varprof <- function(mm,lower=0,upper=20,n=101) {
sg <- seq(lower, upper, len = n)
dev <- mm at deviance
nc <- length(dev)
nms <- names(dev)
vals <- matrix(0, nrow = length(sg), ncol = nc, dimnames = list(NULL,
nms))
for (i in seq(along = sg)) {
.Call("mer_ST_setPars", mm, sg[i], PACKAGE = "lme4")
.Call("mer_update_L", mm, PACKAGE = "lme4")
res <- try(.Call("mer_update_RX", mm, PACKAGE = "lme4"), silent = TRUE)
if (inherits(res, "try-error")) {
vals[i,] <- NA
} else {
.Call("mer_update_ranef", mm, PACKAGE = "lme4")
.Call("mer_update_dev", mm, PACKAGE = "lme4") ## added for glmmML
vals[i,] <- mm at deviance
}
}
vals
}
attr(VarCorr(mod)$f,"stddev") ## 1.633
logLik(mod) ## -535
vv <- varprof(mod)
attr(VarCorr(mod)$f,"stddev") ## 20
logLik(mod) ## -556!
sessionInfo()
R version 2.8.1 (2008-12-22) i486-pc-linux-gnu locale: LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C attached base packages: [1] stats4 stats graphics grDevices utils datasets methods [8] base other attached packages: [1] emdbook_1.1.1.1 MASS_7.2-45 glmmADMB_0.3 glmmML_0.81-3 [5] lme4_0.999375-28 Matrix_0.999375-17 lattice_0.17-17 loaded via a namespace (and not attached): [1] coda_0.13-3 grid_2.8.1 rjags_1.0.3-4 -- Ben Bolker Associate professor, Biology Dep't, Univ. of Florida bolker at ufl.edu / www.zoology.ufl.edu/bolker GPG key: www.zoology.ufl.edu/bolker/benbolker-publickey.asc