Bleeding edge lme4 (or lme4a) plus DF estimation
On Mon, Feb 27, 2012 at 4:48 PM, Joehanes, Roby (NIH/NHLBI) [F]
<roby.joehanes at nih.gov> wrote:
Hi Ben: Thank you very much. I think I will familiarize myself with the lme4Eigen code for a while. A few more questions: 1. I noticed that there is lme4Eigen version 0.9996875-9 in the repository: http://lme4.r-forge.r-project.org/repos/src/contrib/ Which SVN revision does it correspond to? I am currently using it and it appears to be quite stable.
The SVN revision number is stored in the DESCRIPTION file for recent versions of lme4Eigen. It is actually the revision number for the last check-in of the DESCRIPTION file but that gets updated fairly frequently so the revision number in there is a pretty tight lower bound.
2. It seems that the code for revision 1618 isn't that much different than the HEAD branch (I believe 1621 at the moment). Is it really that buggy? I am primarily interested in the lmer, not glmer.
The current revision (1623) should compile. The breakage was something I did on Friday and needed to commit because it was only on my laptop. I realized later that I had a modified version of RcppEigen on my laptop and without those modifications the compilation croaked. I created a version that doesn't use those particular modifications, at least for now.
3. (A different topic) Is there any way to speed up calling lmer on the same X and Z matrices but thousands of different y columns? In general linear model, we can invoke QR decomposition and use Q and R matrices to speed the calculations up. Is there such a decomposition (or method) that we can exploit to speed up the calculation?
Yes, check out the refit function. I just saw that the documentation suffers from cut-and-paste errors but the general idea is to give a fitted model a new response and run only the optimization step.
(fm1 <- lmer(Yield ~ 1|Batch, Dyestuff))
Linear mixed model fit by REML ['lmerMod']
Formula: Yield ~ 1 | Batch
Data: Dyestuff
REML criterion at convergence: 319.6543
Random effects:
Groups Name Variance Std.Dev.
Batch (Intercept) 1764 42.00
Residual 2451 49.51
Number of obs: 30, groups: Batch, 6
Fixed effects:
Estimate Std. Error t value
(Intercept) 1527.50 19.38 78.8
refit(fm1, Dyestuff2$Yield)
Linear mixed model fit by REML ['lmerMod']
Formula: Yield ~ 1 | Batch
Data: Dyestuff
REML criterion at convergence: 161.8283
Random effects:
Groups Name Variance Std.Dev.
Batch (Intercept) 0.00 0.000
Residual 13.81 3.716
Number of obs: 30, groups: Batch, 6
Fixed effects:
Estimate Std. Error t value
(Intercept) 5.6656 0.6784 8.352
Finally, thank you for the pointer on Kenward-Roger DF estimation. Sincerely, Roby -------- Roby Joehanes Research Associate Roby.Joehanes at nih.gov Building 12A, Room 2007 National Institutes of Health (NIH) Bethesda, MD 20892 P: (301) 402-8702 F: (301) 480-0028 or (301) 402-2867 On Feb 24, 2012, at 8:50 PM, Ben Bolker wrote:
Joehanes, Roby (NIH/NHLBI) [F] <roby.joehanes at ...> writes:
I learned about the impending release of the new version of lme4 (or lme4a) from Dr. Bates' post here: https://stat.ethz.ch/pipermail/r-sig-mixed-models/2012q1/007499.html Firstly, just to make sure, is this based on lme4a? I have the source tarball of lme4a 0.999375-65 and found that a new optimizer, BOBYQA, is now in use, instead of the nlminb. The announcement above also mentioned the same change. So, I suspect that the older lme4 is supplanted with lme4a. Is this true?
?I'm not quite sure what you mean. ?There are three versions of lme4: * lme4 "classic" (version 0.999375-42 is the latest) lme4, built on nlminb ?(as was Bates's previous mixed-models package, nlme) *** lme4 will be preserved on CRAN and (probably) renamed "lme4.0" after the new version (see below) is released as lme4, for users who need backward compatibility (we thought this was a reasonable name: there are still a few issues with package names containing dots, so we may need to change this ...). * lme4a (0.9996875-1) uses bobyqa * lmeEigen (to be renamed lme4 when released, sometime soon ...) ?uses a mixture of bobyqa and a new Nelder-Mead implementation that allows box constraints, adapted from the nloptr package, which in turn wraps the NLopt open-source optimization codes (http://ab-initio.mit.edu/wiki/index.php/NLopt_Introduction).
Secondly, I would like to get a source tarball of the latest bleeding edge release to play with (or even read-only SVN access). I found from some sniff tests the lmer outputs of lme4a to be closely matched with those of SAS than those of the lme4 (with nlminb optimizer), except for the lack of p-values. I would love to play with the new version and even give you comparisons with the old version. The problem I am facing with lme4a 0.999375-65 is that it sometimes crashes (core dumps).
?That's easy: just go here for instructions on SVN access: http://r-forge.r-project.org/scm/?group_id=60 (the package is *so* bleeding edge that if you get it right now, you should roll back to SVN release 1618; releases 1619+ are currently broken ...)
Thirdly, I also would love to see the Satterthwaite or Kenward-Rogers DF estimation. I would like to try to add these features into lme4, if you will. I don't know much about the formulas to compute the DFs from quantities output by lmer / glmer. Any pointers?
Doug Bates is on record as saying that adapting the Kenward-Roger formulation to work with his code would be difficult <https://stat.ethz.ch/pipermail/r-help/2008-February/155372.html>, but Ulrich Halekoh and S?ren H?jsgaard have an implementation in the pbkrtest package <http://cran.r-project.org/web/packages/pbkrtest/index.html> which you could look at. ?Perhaps that would give you a hint about a Satterthwaite implementation as well ... [PS there is no "s" in "Kenward-Roger" -- this is a very common mistake, Kenward-Roger gets 1.03 million google hits while Kenward-Rogers gets 438K ...]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models