
Improving computation time for a binary outcome in lme4

3 messages · Robin Jeffries, Douglas Bates, Adam D. I. Kramer

On Mon, May 24, 2010 at 7:46 PM, Robin Jeffries <rjeffries at ucla.edu> wrote:
There are at least two characteristics of the generalized linear mixed
model that are causing the increase in computational time.  The first
is the fact that the algorithm is based on iteratively reweighted
least squares (IRLS) and not ordinary least squares (OLS).  It is
inevitable that an iterative algorithm is slower than a direct
calculation.
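To make the contrast concrete, here is a minimal base-R sketch of IRLS for an ordinary logistic regression (not lme4's algorithm for GLMMs, which is more involved, but the same idea): each step solves a weighted least-squares problem, and the weights and working response must be recomputed until the coefficients stop changing, whereas OLS is a single direct solve.

```r
## Minimal IRLS sketch for logistic regression (illustrative only).
set.seed(1)
x <- cbind(1, rnorm(200))                       # design matrix with intercept
y <- rbinom(200, 1, plogis(x %*% c(-0.5, 1)))   # simulated binary response

beta <- c(0, 0)
for (iter in 1:25) {
  eta <- drop(x %*% beta)        # linear predictor
  mu  <- plogis(eta)             # fitted probabilities
  w   <- mu * (1 - mu)           # IRLS weights
  z   <- eta + (y - mu) / w      # working response
  beta_old <- beta
  ## one weighted least-squares solve per iteration
  beta <- drop(solve(crossprod(x, w * x), crossprod(x, w * z)))
  if (max(abs(beta - beta_old)) < 1e-10) break
}

## agrees with glm(), which uses the same algorithm internally
beta - coef(glm(y ~ x[, 2], family = binomial))
```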

The second cause is the fact that one can "profile out" the
fixed-effects parameters in a linear mixed-effects model but not in a
generalized linear mixed-effects model.  You can fake it to some
extent, but the currently released version of the lme4 package doesn't
do so.  Thus, the greater the number of fixed-effects parameters, the
greater the complexity of the problem.

If you use the verbose option to lmer and to glmer on similar problems
you will see that lmer is optimizing over fewer parameters than glmer
is.
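A quick way to see this for yourself, sketched here with the example datasets shipped with lme4 (sleepstudy and cbpp) and guarded so it only runs if the package is installed: with verbose > 0 both fitting functions print one line per optimizer iteration, so you can watch how many parameters each is working over.

```r
## Illustrative sketch: per-iteration optimizer output from lmer and glmer.
if (requireNamespace("lme4", quietly = TRUE)) {
  library(lme4)
  ## linear mixed model: fixed effects are profiled out
  fm <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy, verbose = 1L)
  ## generalized linear mixed model: fixed effects stay in the optimization
  gm <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
              data = cbpp, family = binomial, verbose = 1L)
}
```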
As I mentioned in my reply on R-help, the development version of the
lme4 package does have a sparseX option.  For a factor with 6 levels
it is unlikely that it will help.  The sparsity index of the X matrix
will be greater than 1/6 and that is close to the breakpoint where
dense methods, which do more numerical computation but less structural
analysis, are actually faster than sparse methods.
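A rough base-R check of that sparsity figure (illustrative only): for a balanced six-level factor, the dense model matrix with intercept and treatment contrasts is about 31% nonzero, well above the roughly 1/6 region described above.

```r
## Density of the X matrix for a balanced 6-level factor.
f <- gl(6, 10)            # 6 levels, 10 observations each
X <- model.matrix(~ f)    # intercept column + 5 dummy columns
density <- mean(X != 0)   # fraction of nonzero entries
density                   # 11/36, about 0.306
```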
There are the usual suspects: getting access to a fast computer with
lots of memory and a 64-bit operating system.  You could see whether
an accelerated BLAS will help.  For example, Revolution R has the MKL
BLAS built-in.  Regrettably, that isn't always a speed boost.  We have
seen situations where multi-threaded BLAS actually slow down sparse
matrix operations because the communications overhead is greater than
the time savings of being able to perform more flops per second.
On Mon, 24 May 2010, Douglas Bates wrote:

Just one potentially useful observation: Turning on "verbose" makes the
waiting period much MUCH more tolerable. It's kinda like a progress bar--you
know glmer is doing something and that makes it easier to wait.

For some huge models with bigger-than-I-needed data sets (back in the
netflix prize days), I just let R run overnight and got what I wanted--but I
had never let it go more than an hour before I worried that it was looping.
My kingdom for multi-threaded nlm()...

--Adam