updating functions mangling original copies of glmer fits?

Fri, Jan 2, 2009 8:07 AM

I thought others might be interested in this exchange that Ben and I
had over New Year's.

---------- Forwarded message ----------
From: Douglas Bates <bates at stat.wisc.edu>
Date: Thu, Jan 1, 2009 at 4:26 PM
Subject: Re: updating functions mangling original copies of glmer fits?
To: Ben Bolker <bolker at ufl.edu>

Thanks for the note, Ben.

On Sun, Dec 28, 2008 at 11:08 AM, Ben Bolker <bolker at ufl.edu> wrote:

What you have observed is indeed a violation of the functional
semantics of the S language, briefly summarized as "Thou shalt not
modify the value of an argument without copying".  The magic to do
this is not deep - inside a C function called through .Call you
can do whatever you want, including modifying the value of an
argument.

My excuse for doing things this way is the usual: efficiency.  The mer
object encapsulates the model and its current state during the
optimization process.  It would be possible to copy and update that
object every time the deviance is evaluated during the optimization
but
 a) that object can be huge
 b) even with all the tricks that are used for optimization, it still
requires many, many evaluations

Even one copy of such an object can be expensive.  There was a recent
message to the R-SIG-Mixed-Models mailing list regarding being able to
fit the model but running out of memory when trying to summarize it.
The problem was that there is a copy made in the summary method.

In the currently released version of lme4 I finessed the copying issue
by modifying the object in place during the optimization and keeping
the whole thing behind closed doors, as it were.  I suppose that when
I wrote code such as that in the Implementation vignette I should have
added a "Don't try this at home, kids" caveat but instead I modified
(yet again!) the overall approach.  If you check the code in the
"allcoef" branch of the SVN repository you will see that the model is
represented by an environment through the optimization phase and only
then converted to an S4 object.  Environments are special composite
objects in that they are never copied.  If an environment is passed to
a function and modified during the evaluation of the function then the
modifications are visible outside the function.  As such they are a
natural way of representing "state" during an optimization.

There are a couple of other (well, many other) enhancements in the
allcoef branch.  It uses an explicit call to nlminb for the
optimization with all the information encapsulated in the environment
and there is provision for incorporating additional parameters in the
optimization.  The name "allcoef" comes from a switch in the role of
the fixed-effects parameters in a generalized linear mixed model or a
nonlinear mixed model.  Now, all the coefficients in the linear
predictor are optimized during the IRLS iterations (the "inner"
optimization, which is well-controlled) and not during the general,
constrained optimization (the "outer" optimization performed by
nlminb).

Would it be okay for me to copy this reply to the R-SIG-Mixed-Models email list?

Happy New Year,
Doug Bates

 I can obviously work around this by re-fitting the object after
computing the profile to restore it, but it does feel like a bug ...

 Haven't tested yet whether this problem is specific to mer_update_dev
or whether it happens with LMMs as well ...

 cheers
   Ben Bolker


set.seed(1001)
x <- runif(1000)
f <- factor(rep(1:10,each=100))
blockeff <- rnorm(10,sd=1)
linpred <- exp(1+2*x+blockeff[f])
y <- rpois(1000,linpred)
dat <- data.frame(x,f,y)

plot(x,1+y,type="n",log="y")
text(x,1+y,as.character(f))
library(lme4)
mod <- glmer(y~x+(1|f),data=dat,family=poisson)

varprof <- function(mm,lower=0,upper=20,n=101) {
 sg <- seq(lower, upper, len = n)
 dev <- mm at deviance
 nc <- length(dev)
 nms <- names(dev)
 vals <- matrix(0, nrow = length(sg), ncol = nc, dimnames = list(NULL,
nms))
 for (i in seq(along = sg)) {
   .Call("mer_ST_setPars", mm, sg[i], PACKAGE = "lme4")
   .Call("mer_update_L", mm, PACKAGE = "lme4")
   res <- try(.Call("mer_update_RX", mm, PACKAGE = "lme4"), silent = TRUE)
   if (inherits(res, "try-error")) {
       vals[i,] <- NA
     } else {
       .Call("mer_update_ranef", mm, PACKAGE = "lme4")
       .Call("mer_update_dev", mm, PACKAGE = "lme4") ## added for glmmML
       vals[i,] <- mm at deviance
     }
 }
 vals
}

attr(VarCorr(mod)$f,"stddev") ## 1.633
logLik(mod) ## -535
vv <- varprof(mod)
attr(VarCorr(mod)$f,"stddev") ## 20
logLik(mod)  ## -556!

sessionInfo()

R version 2.8.1 (2008-12-22)
i486-pc-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] emdbook_1.1.1.1    MASS_7.2-45        glmmADMB_0.3
glmmML_0.81-3
[5] lme4_0.999375-28   Matrix_0.999375-17 lattice_0.17-17

loaded via a namespace (and not attached):
[1] coda_0.13-3   grid_2.8.1    rjags_1.0.3-4

--
Ben Bolker
Associate professor, Biology Dep't, Univ. of Florida
bolker at ufl.edu / www.zoology.ufl.edu/bolker
GPG key: www.zoology.ufl.edu/bolker/benbolker-publickey.asc