Making lme4 faster for specific case of sparse x
On Tue, Aug 9, 2016 at 8:36 AM Patrick Miller <pmille13 at nd.edu> wrote:
Thanks for that clarification. In my situation, the effect of each predictor in X was allowed to vary by a single grouping variable. The lmer formula is something like the following: y ~ 1 + X1 + X2 + X3 + ... + ( 1 + X1 + X2 + X3 + ... | id)
Okay - that's not the same as X == Z but we'll let that slide. It is extremely unlikely that you will be able to fit such a model and get a meaningful result.

Suppose that you have p columns in the fixed-effects model matrix, X, and k levels of the id factor. The covariance matrix of the random effects will be p by p, with p * (p + 1) / 2 distinct elements to estimate. It is difficult to estimate large covariance matrices with any accuracy; you would need k to be very, very large to have any hope of doing so. To make it worthwhile using a sparse representation of X you would need p to be large - in the hundreds or thousands - which would leave you trying to estimate tens of thousands of covariance parameters. It is just not on.

If you feel you must fit this model because of the "keep it maximal" advice of Barr et al. (2013), remember that they reached that conclusion on the basis of a simulation of a model with one covariate. That is, they were comparing fitting a 1 by 1 covariance matrix with fitting a 2 by 2 covariance matrix. To conclude on the basis of such a small simulation that everyone must always use the maximal model, even when it would involve tens or hundreds of covariance parameters, is quite a leap.
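To see how fast that parameter count grows, here is a quick back-of-the-envelope calculation in R (p is the number of columns of X, as above; the function name is just for illustration):

```r
# Distinct (co)variance parameters in an unstructured p-by-p
# random-effects covariance matrix: p * (p + 1) / 2
n_cov_params <- function(p) p * (p + 1) / 2

# Even moderate p gives many parameters to estimate
sapply(c(2, 10, 100, 1000), n_cov_params)
# 3  55  5050  500500
```

At p = 100 you are already asking the optimizer for 5050 covariance parameters, on top of the fixed effects.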
On Mon, Aug 8, 2016 at 6:08 PM, Douglas Bates <bates at stat.wisc.edu> wrote:
If X == Z, don't you have problems with estimability? It seems that the MLE would always correspond to all random effects being zero. Perhaps I misunderstand the situation. Could you provide a bit more detail on how it comes about that X == Z?

On Mon, Aug 8, 2016 at 5:01 PM Patrick Miller <pmille13 at nd.edu> wrote:
Hello,
For my dissertation, I'm working on extending boosted decision trees to clustered data.

In one of the approaches I'm considering, I use *lmer* to estimate random effects within each gradient descent iteration in boosting. As you might expect, this is computationally intensive. However, my intuition is that this step could be made faster because my use case is very specific. Namely, in each iteration, *X = Z*, and *X* is a sparse matrix of 0s and 1s (with an intercept).
I was wondering if anyone had suggestions or (theoretical) guidance on this problem. For instance, is it possible that this special case permits faster optimization via specific derivatives? I'm not expecting this to be implemented in lmer or anything, and I'm happy to work out a basic implementation myself for this case.
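One avenue worth knowing about for repeated fits like this is lme4's modular interface (see help("modular", package = "lme4")): the formula parsing, sparse model-matrix construction, and deviance-function setup are separated from the optimization, so the expensive structural work need not be repeated on every boosting iteration. A rough sketch, using the built-in sleepstudy data purely as a stand-in for the X = Z setup:

```r
library(lme4)

# 1. Parse the formula and build the model structures (fr, X, reTrms) once
lmod <- lFormula(Reaction ~ Days + (Days | Subject), data = sleepstudy)

# 2. Construct the profiled deviance function from those structures
devfun <- do.call(mkLmerDevfun, lmod)

# 3. Optimize over the covariance parameters (theta)
opt <- optimizeLmer(devfun)

# 4. Package the result as a merMod object, as lmer() would
fit <- mkMerMod(environment(devfun), opt, lmod$reTrms, fr = lmod$fr)
```

In a boosting loop the idea would be to rebuild only what actually changes between iterations (the working response), reusing the sparse random-effects structures rather than paying the full lmer() setup cost every time. How cleanly the response can be swapped in without redoing step 1 is something you would need to check against the modular-fitting documentation.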
I've read the vignette on speeding up the performance of lmer, and setting calc.derivs = FALSE resulted in about a 15% performance improvement for free, which was great. I was just wondering if it was possible to go further.
Thanks in advance,
- Patrick
--
Patrick Miller
Ph.D. Candidate, Quantitative Psychology
University of Notre Dame
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models