Multi Processor / lme4
On 14-04-27 02:34 PM, Doogan, Nathan wrote:
Thanks for the info, Ben. It sounds like the best way for me to speed things up at the moment, then, is to build R with a threaded linear algebra library. -Nate
Actually, I don't know if that will help; most of the difficult linear algebra in lme4 is sparse linear algebra, handled through the Matrix package (wrapping Tim Davis's SuiteSparse library) and the RcppEigen package. I'm not sure how much of it really uses the standard BLAS back-end. Someone with time on their hands could do some benchmarking and see what happens; http://cran.r-project.org/web/packages/gcbd/vignettes/gcbd.pdf is a good resource.

If you're really interested in performance, and feeling adventurous, I would definitely recommend Doug Bates's MixedModels package for Julia. It is also the case at the moment that lme4 is slower than lme4.0 for some problems.

If you have specific performance questions (rather than just "it would be nice for lme4 to be faster", which I don't disagree with), it would be good to give the parameters of your problem: how many observations, grouping variables, number of levels of grouping variables, LMM vs. GLMM, structure of grouping variables (nested, crossed, partially crossed) ... ?

Ben Bolker
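[Editor's note: a minimal sketch of the kind of benchmarking suggested above, using only base R. The matrix size is arbitrary; the point is to time a dense, BLAS-bound operation under whichever BLAS your R build is linked against, before and after swapping in a threaded library.]

```r
## Rough sketch: time a dense, BLAS-bound operation to see whether
## swapping in a threaded BLAS (e.g. OpenBLAS) makes any difference.
## The 2000 x 2000 size is arbitrary -- pick something big enough
## that the timing is dominated by the BLAS call itself.
set.seed(1)
m <- matrix(rnorm(2000 * 2000), nrow = 2000)

## crossprod(m) computes t(m) %*% m through a BLAS routine
print(system.time(crossprod(m)))

## sessionInfo() reports (in recent R versions) which BLAS/LAPACK
## libraries R is actually linked against
sessionInfo()
```

Repeating the timing after rebuilding or relinking R against a threaded BLAS would show whether the dense-BLAS path matters for your workload; it would not, of course, tell you how much of lme4's sparse work goes through that back-end.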
p.s. a duplicate of this message (sent from a non-member email address) is waiting for moderation. feel free to delete. On Sun, Apr 27, 2014 at 2:00 PM, Ben Bolker <bbolker at gmail.com> wrote:
[This is a perfectly reasonable question for r-sig-mixed-models, so I'm forwarding it there]

===================

This might be very naive. I assume a very costly part of estimating parameters is evaluating the likelihood, particularly when the data are large. It seems like it'd be fairly easy to distribute that evaluation across several processes (i.e., multithread it) to speed up the procedure (e.g., with mclapply or pvec in R's "parallel" package). That said, I'd guess likelihood evaluation in (g)lmer actually happens somewhere else, where I am not so comfortable. Any obvious reasons this map-reduce (I think it's called) sort of technique is not in use? Thanks for your time. -Nate

-- Nathan Doogan, Ph.D.
Post Doctoral Researcher
The Colleges of Social Work and Public Health
The Ohio State University

===================

It is indeed a little naive, but not silly at all. The problem is that it is *not* "fairly easy" to distribute the evaluation across processors via map-reduce/mclapply etc. Doug Bates has already done all kinds of wizardry to reduce the likelihood evaluation to linear algebra operations that can be done very efficiently, which speeds the process up enormously. He has been doing some careful profiling with his Julia code, and the largest single cost (if I am recalling things correctly) is computing a sparse Cholesky decomposition. He has been looking *very* recently at the PaStiX library <http://pastix.gforge.inria.fr/> as a possible way to parallelize this operation, but it is not completely trivial.

cheers
Ben Bolker
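[Editor's note: a minimal sketch of what the map-reduce idea above *can* do, namely farming out independent whole-likelihood evaluations (as in a grid search or profiling), even though a single evaluation is serial linear algebra internally. It uses lme4's real devFunOnly argument and the built-in sleepstudy data; the particular theta values are arbitrary, and mc.cores > 1 does not work on Windows.]

```r
library(lme4)       # for lmer() and the built-in sleepstudy data
library(parallel)   # for mclapply()

## devFunOnly = TRUE returns the deviance function itself rather than
## a fitted model; it takes the relative-covariance parameters theta
## (here length 3, for the 2x2 Cholesky factor of (Days | Subject)).
devf <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy,
             devFunOnly = TRUE)

## Independent evaluations at an arbitrary grid of theta values can be
## distributed across processes with mclapply() -- the "map" step.
## Each single evaluation remains serial.
thetas <- list(c(1, 0, 1), c(0.5, 0, 0.5), c(2, 0, 2))
devs <- mclapply(thetas, devf, mc.cores = 2)  # use mc.cores = 1 on Windows
print(unlist(devs))
```

This parallelizes *across* candidate parameter values, which is useful for profiling or sensitivity checks; it does nothing for the optimizer's inherently sequential path through parameter space, which is the hard problem described above.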
_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models