A while back, I ran lmer on a very large model in both Microsoft R and
standard R, and Microsoft R was indeed faster for the same model on the
same computer. Not by an order of magnitude, and nothing life changing,
but faster nonetheless.
From: Douglas Bates <bates at stat.wisc.edu>
Date: Thursday, January 18, 2018 at 3:30 PM
To: AIR <hdoran at air.org>
Cc: Nicolas Bédère <n.bedere at gmail.com>, "r-sig-mixed-models at r-project.org"
<r-sig-mixed-models at r-project.org>
Subject: Re: [R-sig-ME] How can I make R using more than 1 core (8
available) on a Ubuntu Rstudio server ?
On Thu, Jan 18, 2018 at 2:16 PM Doran, Harold <HDoran at air.org> wrote:
@DB, I thought you were retired :)
I am retired. I'm just not very good at it and keep coming in to the
office to work on various projects.
But, to the OP, lme4 functions already take advantage of many
computational methods that make fitting these models to large data sets
faster than in (virtually) all other packages for estimating mixed linear
models.
The MixedModels package in Julia will usually perform at least as well as
lme4 and sometimes much better. Of course, using it entails learning a bit
of Julia. I would point out that with the RCall and RData packages for
Julia it is fairly straightforward to pass the data back and forth between
R and Julia.
The packages you might come across for parallel processing won't
necessarily apply here. For example, the foreach package is fantastic, but
could not be applied to a glmer model.
Although, Doug, I do recall coming across some work, I think in the
Microsoft R distribution, that did some parallel computing for matrix
problems by default. I'm saying this from memory and cannot recall specifics.
The Microsoft R distribution (and, before that, Revolution R) uses the MKL
BLAS that I mentioned. Thanks for the reminder. It may be worthwhile
trying with lme4. Those benchmarks are somewhat disingenuous because they
only benchmark some linear algebra operations, which is what MKL does very
well. Interestingly, the most important operation for statisticians -
obtaining least squares solutions - is not accelerated in the standard R
solution.
With that said, I'm not certain parallel processing is the right thing to
do with problems of this sort. Iteration t+1 depends on iteration t, and
when pieces of the solution live on different processors, combining them
back together is not free: the overhead can make the whole computation
slower rather than faster.
Parallelizing model fitting code is very tricky.
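
To illustrate the point about sequential dependence, here is a minimal
sketch in Python. It is not from the thread and is not how lme4 works
internally: Newton's method for a square root stands in for an iterative
model fit, and the names newton_sqrt and solve_many are made up for this
example. Inside one solve, each iterate is computed from the previous one,
so the loop cannot be split across cores. Several independent solves, on
the other hand, can each be handed to a separate worker, which is the kind
of situation where tools like foreach help.

```python
from concurrent.futures import ThreadPoolExecutor


def newton_sqrt(a, x0=1.0, tol=1e-12, max_iter=100):
    """Approximate sqrt(a) with Newton's method (a stand-in for one model fit)."""
    x = x0
    for _ in range(max_iter):
        # Iterate t+1 is computed from iterate t, so the steps of this
        # loop must run one after another.
        x_next = 0.5 * (x + a / x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x


def solve_many(values, workers=4):
    """Run several independent solves, one whole solve per worker.

    Threads are used here only to show the structure; for CPU-bound work
    in CPython a process pool would be the usual choice.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(newton_sqrt, values))


if __name__ == "__main__":
    print(newton_sqrt(2.0))
    print(solve_many([4.0, 9.0, 16.0]))
```

A single glmer fit resembles the inner loop of newton_sqrt, not the list
passed to solve_many, which is why adding cores does not shorten it
automatically.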
-----Original Message-----
From: R-sig-mixed-models [mailto:r-sig-mixed-models-bounces at r-project.org]
On Behalf Of Douglas Bates
Sent: Thursday, January 18, 2018 3:07 PM
To: Nicolas Bédère <n.bedere at gmail.com>
Cc: R SIG Mixed Models <r-sig-mixed-models at r-project.org>
Subject: Re: [R-sig-ME] How can I make R using more than 1 core (8
available) on a Ubuntu Rstudio server ?
The procedure is fairly simple - just rewrite the lme4 package from
scratch. :-)
On Thu, Jan 18, 2018 at 2:03 PM Nicolas Bédère <n.bedere at gmail.com>
wrote:
I want to run the *glmer* procedure on a "large" dataset (250,000
observations). The model includes 5 fixed effects, 2 interaction
terms and 3 random effects. It takes more than 15 min to run on my
laptop (recent Intel Core i7, RAM = 4 GB). Thus, the IT department of the
University where I work set up an RStudio server running on
Ubuntu. My problem is that 8 cores are available on this server,
but when I run the *glmer* procedure only 1 of them is used and
it takes more than 1 h to get the results... How can I solve this
problem and improve time efficiency? I found on Google that I may have to
use a parallel procedure, but (i) I am not familiar at all with those
computing procedures and they look a bit complicated, and (ii) the code
I picked works with other functions in other packages such as
*kmeans{stats}*
(https://stackoverflow.com/questions/29998718/how-can-i-make-r-use-more-cpu-and-memory)
but with neither *lmer* nor *glmer*.
Can you please help with a simple procedure to tackle the problem?
Many thanks !