Computational speed - MCMCglmm/lmer

9 messages · David Atkins, Paul Johnson, Douglas Bates +2 more

#
Hi all--

I use (g)lmer and MCMCglmm on a weekly basis, and I am wondering about 
options for speeding up their computations.  This is primarily an issue 
with MCMCglmm, given the many MCMC iterations needed to reach 
convergence on some problems.  But even with glmer(), I have runs that 
take 20-30 minutes.

First, let me be very clear that this is in no way a criticism of Doug's 
and Jarrod's work (they are the package developers for lme4 and 
MCMCglmm, respectively).  Their code has brought models and data within 
reach that would otherwise not have been feasible.

Second, I have included link to data and script below, along with some 
timings on my computer: Mac Book Pro, 2.5GHz, with 4GB RAM.  Here's 
sessionInfo from my runs:

 > sessionInfo()
R version 2.11.1 (2010-05-31)
i386-apple-darwin9.8.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] MCMCglmm_2.05      corpcor_1.5.6      ape_2.5-3
[4] coda_0.13-5        tensorA_0.35       lme4_0.999375-34
[7] Matrix_0.999375-41 lattice_0.19-7

loaded via a namespace (and not attached):
[1] gee_4.13-14   grid_2.11.1   nlme_3.1-96   stats4_2.11.1 tools_2.11.1

Specific questions:

1. I would be curious to know timings on other people's set-ups.  Jarrod 
and I had an exchange one time where he was gracious enough to run a 
zero-inflated model where I was concerned about convergence.  He ran a 
model with 1.3M iterations, which I think would take a number of days, 
if not a week, on my computer.  This was part of what got me thinking 
about this.  Thus, my first interest is whether there is an "optimal" 
hardware/OS configuration, or does it matter?

Some things I see in R-help archives:

2. 32 vs. 64-bit: Seems like this is mostly an issue of data/model size 
and whether you need to access more than 4GB of RAM.  AFAICS, 64-bit 
processors are not necessarily faster.

3. "Optimized" BLAS: There's a bit of discussion about optimized BLAS 
(Basic Linear Algebra Subprograms).  However, these discussions note 
that there is no generally superior BLAS.  Not sure whether a specific 
BLAS might be optimized for GLMM computations.

4. Parallel computing: With multi-core computers, looks like there are 
some avenues for splitting intensive computations across processors.
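
A minimal sketch of that last avenue (not from this thread, and assuming 
the drinking model and prior from the script below): iterations within a 
single MCMC chain are serial, but independent chains are embarrassingly 
parallel, so one practical option is to run several chains at once and 
pool the diagnostics.

```r
## Hypothetical sketch: run 4 independent MCMCglmm chains in parallel.
## mclapply() forks, so this works on Unix-alikes (Mac/Linux); 'parallel'
## ships with modern R (at the time of this thread the same function
## lived in the 'multicore' package).  Assumes drink.df and prior exist.
library(parallel)
library(MCMCglmm)

chains <- mclapply(1:4, function(i)
    MCMCglmm(drinks ~ weekday * gender,
             random  = ~ us(1 + weekday):id,
             data    = drink.df, family = "poisson",
             prior   = prior, verbose = FALSE),
    mc.cores = 4)
```

Wall-clock time is then roughly that of one chain, and the extra chains 
make between-chain convergence checks possible.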

Finally, I'm asking here b/c I run into these issues with GLMM (and 
zero-inflated mixed models), though most of the discussion I've seen 
thus far about computation speed has been on R-help.

The data below are self-reported drinks (alcohol) from college students 
for up to the last 90 days.  Distribution of counts is zero-inflated.  I 
run a Poisson GLMM with glmer, over-dispersed Poisson GLMM with 
MCMCglmm, and then zero-inflated OD Poisson with MCMCglmm and provide 
timings for my set-up.

Okay, any and all thoughts welcomed.

thanks, Dave


### thinking through computational speed with lmer and MCMCglmm
#
### read drinking data
drink.df <- read.table(file = 
"http://depts.washington.edu/cshrb/newweb/stats%20documents/tlfb.txt",
						header = TRUE, sep = "\t")
str(drink.df) # 57K rows

### id is id variable (shocking)
### gender and weekday are 0/1 indicator variables
### drinks has number of drinks consumed

### distribution of outcome
table(drink.df$drinks)
plot(table(drink.df$drinks), lwd=2) # zero-inflated

### how many people?
length(unique(drink.df$id)) # 990
sort(table(drink.df$id))

### NOTE: most people have max of 90, which they should
###			two folks with 180 and 435 (prob data errors)
###			long negative tail down from 90
#
### NOTE: for this purpose, not worrying about the 180/435

### speed tests
#
### fit random intercept and random slope for weekday
### fixed effects for gender and weekday and interaction
#
### Poisson GLMM with glmer()
library(lme4)

system.time(
drk.glmer <- glmer(drinks ~ weekday*gender + (weekday | id),
					data = drink.df, family = poisson,
					verbose = TRUE)
)
summary(drk.glmer)

### timing
#
###    user  system elapsed
###  36.326   9.013  45.316
					
### over-dispersed Poisson GLMM with MCMCglmm()
library(MCMCglmm)

prior <- list(R = list(V = 1, n = 1),
				G = list(G1 = list(V = diag(2), n = 2)))
system.time(
drk.mcmc <- MCMCglmm(drinks ~ weekday*gender,
					random = ~ us(1 + weekday):id,
					data = drink.df, family = "poisson",
					prior = prior)
)
summary(drk.mcmc) # NOTE: using summary.MCMCglmm in v 2.05 of package

### timing
#
###        user   system  elapsed
### 	1034.317  165.128 1203.536

### zero-inflated, over-dispersed Poisson GLMM with MCMCglmm()
#
### NOTE: haven't run the following yet, other than a quick "toy run" to
###			make sure that it is set up correctly.
### NOTE: this only has random intercept in each portion of the model

prior2 <- list(R = list(V = diag(2), n = 1, fix = 2),
				G = list(G1 = list(V = 1, n = 1),
						  G2 = list(V = 1, n = 1)))
system.time(
drk.zimcmc <- MCMCglmm(drinks ~ -1 + trait*weekday*gender,
					random = ~ idh(at.level(trait, 1)):id + idh(at.level(trait, 2)):id,
					rcov = ~ idh(trait):units,
					data = drink.df, family = "zipoisson",
					prior = prior2)
)
summary(drk.zimcmc)

### timing
#
###    	 user   system  elapsed
###    2105.366  544.881 2640.030
#
Thanks for making the data available, Dave.  I will be very interested
in seeing if I can make glmer more effective in fitting this model.

Right now I have a quick question, what are genders 0, 1, 2?  I
checked that if gender is consistent within subjects, which it is, so
you have 549 subjects of gender 0, 439 of gender 1 and 2 of gender 2.
If I wanted to label them as "M", "F" and "?" which labels should I
apply to genders 0 and 1 and what does gender 2 mean?
> drink.df <- within(drink.df,
+                    {
+                        id      <- factor(id)
+                        gender  <- factor(gender)
+                        weekday <- factor(weekday, labels = c("N","Y"))
+                    })
'data.frame':	57318 obs. of  4 variables:
 $ id     : Factor w/ 990 levels "1014503","1017782",..: 1 1 1 2 2 2 2 2 2 2 ...
 $ gender : Factor w/ 3 levels "0","1","2": 1 1 1 2 2 2 2 2 2 2 ...
 $ weekday: Factor w/ 2 levels "N","Y": 2 1 1 2 1 1 2 2 2 2 ...
 $ drinks : int  0 0 5 8 5 10 0 0 0 0 ...
id        gender    weekday       drinks
 8396125:  435   0:32445   N:16633   Min.   : 0.0000
 6551092:  180   1:24780   Y:40685   1st Qu.: 0.0000
 1028078:   90   2:   93             Median : 0.0000
 1325095:   90                       Mean   : 0.9983
 1365806:   90                       3rd Qu.: 0.0000
 1411092:   90                       Max.   :45.0000
 (Other):56343
'data.frame':	990 obs. of  2 variables:
 $ id    : Factor w/ 990 levels "1014503","1017782",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ gender: Factor w/ 3 levels "0","1","2": 1 2 1 1 2 2 1 1 1 2 ...
id      gender
 1014503:  1   0:549
 1017782:  1   1:439
 1018311:  1   2:  2
 1026066:  1
 1028078:  1
 1036230:  1
 (Other):984
On Sat, Jun 19, 2010 at 10:42 AM, David Atkins <datkins at u.washington.edu> wrote:
#
Douglas Bates wrote:
Whoops, I thought this data was a bit cleaner than it apparently is.

For gender, 0 = Female, 1 = Male

(Men drink more, so pretty easy to confirm...)

I would need to double-check on the 2s, though I do believe there is a 
category for transgendered on the demographics.

I'll try to get a cleaner dataset up soon.

D
2 days later
#
On Sat, Jun 19, 2010 at 10:42 AM, David Atkins <datkins at u.washington.edu> wrote:
Hi, Dave:

I've wondered this same thing.  I replaced the base R BLAS with
GotoBLAS2 and ATLAS, and both are much faster than R's base BLAS.  With
GotoBLAS2, computation is about 10x faster on linear algebra
problems, especially the kinds of problems where it can thread
computations across all cores.  The BLAS library from ATLAS does not
seem to thread, so it is not quite as fast.

In any case, I've tested your example on this Lenovo T61 laptop
with a dual-core Pentium that maxes out at 2.4GHz.

To calculate your model with the base R BLAS:

drk.glmer

 user  system elapsed
 29.920   0.120  30.245


The time elapsed with the optimized BLAS is not as much faster as I
had expected.  With ATLAS it is:

   user  system elapsed
 25.660   0.100  25.784

GotoBLAS2 is almost identical; I'm quite surprised.  On other tests
I've done, it supplies a more noticeable speedup because it can go
multi-core when needed.  I was monitoring the CPU, and the calculations
all stayed on one core.

   user  system elapsed
 25.670   0.050  25.725

Well, if you use ATLAS or GotoBLAS2, you can expect roughly a 1/6 reduction in run time.
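
A quick way to check which BLAS an install is actually using, and 
whether it threads, is to time an operation that is almost pure BLAS 
(a sketch, not from this thread):

```r
## A large matrix multiply is essentially one dgemm call.  A tuned,
## threaded BLAS should beat R's reference BLAS by a wide margin here,
## even when it barely helps a mixed-model fit.
n <- 2000
A <- matrix(rnorm(n * n), n, n)
system.time(B <- A %*% A)
## Watch the cores in top / Activity Monitor during the multiply to see
## whether the library threads.
```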

I made the mistake of running that example with MCMCglmm in your code.
The system is locked in mortal combat with that.  I didn't notice that
your elapsed time was 1208 seconds before I started that one.  :(

pj
#
On Tue, Jun 22, 2010 at 3:06 PM, Paul Johnson <pauljohn32 at gmail.com> wrote:
On this particular model/data set combination.  Accelerated BLAS
change the speed of low-level numerical linear algebra operations, the
so-called basic linear algebra subroutines.  If those are the
bottleneck in your calculation you will see a performance boost; if
they are not, you won't.

Accelerated BLAS are not a panacea.  Neither is parallel computation.
When your computation is essentially single-threaded, as an
optimization like this one is, it doesn't matter whether you have one
core or twelve.

The basic rule of optimizing performance of programs is to profile
*before* you make changes.  Making great efforts to optimize an
operation that takes only 5% of the execution time will provide you
with at most a 5% gain in performance.
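
A minimal sketch of that advice in R (assuming the glmer fit from 
Dave's script): Rprof() samples the call stack while the model runs, 
and summaryRprof() shows where the time actually goes.

```r
## Profile the fit before trying to optimize anything.
Rprof("glmer-profile.out")
drk.glmer <- glmer(drinks ~ weekday * gender + (weekday | id),
                   data = drink.df, family = poisson)
Rprof(NULL)
## 'by.self' ranks functions by time spent in their own code; if BLAS
## routines are not near the top, an optimized BLAS cannot help much.
head(summaryRprof("glmer-profile.out")$by.self)
```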
Forgive me for sounding grouchy but I find this whole discussion
misguided.  Worrying about the speed of fitting a model and niceties
of the model formulation before doing elementary checks on the data is
putting the cart before the horse.  Why is gender coded as 0, 1 and 2?
Why, when there was a maximum of 90 days of monitoring, is there one
id with 435 observations and another with 180?  Did someone really
have 45 drinks in one day and, if so, are they still alive?
Accelerated BLAS and parallel algorithms are way, way down the
list of issues that should be addressed.
#
Especially when you look at (e.g.) subset(drink.df, id == 8396125) and
notice that drinks appear to be repeated in blocks of five.  Looks
like a data processing error - but has it also happened for other
subjects?

I also wonder why the date has been removed from the data. It seems
like this would be an important covariate (St Patrick's day, match
days, birthdays, ...).  And I hope weekday isn't Monday-Friday...

Hadley
#
First, my apologies for not providing clean data in my initial posting, 
which was clearly an oversight on my part.  I have an email in to the PI 
but haven't gotten a response yet.

I have attached an updated dataset that pulls out the 2 suspect 
participants as well as those with gender coded 2.  In addition, it 
includes two more variables:

weekday - 7 level factor for day of week (new variable weekend now has 
the binary version)
fratsor - a binary factor indicating whether individual is part of a 
fraternity or sorority (or not)

I have an updated script below that shows (descriptively) how drinking 
varies over these factors.  Moreover, including weekday as either fixed 
or random really ratchets up the time for glmer.  Again, my original 
purpose is definitely *not* to criticize lme4 (or Doug), but simply to 
get some feedback about computational speed, and whether or not it's 
possible to speed up runs via hardware, OS, or software.

Finally, in providing some data, I was trying to provide a brief subset 
of data that has certain realistic qualities (and it is real data) but 
is not a "full" data set.  In trying to be simple and straightforward, I 
may have raised more questions.

So, in response to a few questions raised:

The data come from a study of problematic drinking focused on high-risk 
events, and this particular data focused on 21st birthdays (here in the 
US, that is the legal drinking age).  The present data are from a survey 
(ie, non-intervention), which then informed a later intervention study. 
Thus, the (up to) 90 days of retrospective drinking will cover the 
individual's 21st birthday.

Doug asks about someone reporting 45 drinks.  Because we are capturing 
21st birthdays, we are definitely getting some extreme drinking, but we 
also are relying on self-report.  Clearly, there will be some errors and 
possible bravado about feats of drinking.  Moreover, the study has a 
protocol for addressing potentially dangerous levels of drinking with 
participants (when blood-alcohol content passes a certain cutoff).

Hadley asks about the repeated data.  Part of the study involves friends 
of the main participant (as one intervention condition includes friends' 
input as part of the intervention).  Some of the friends' data were 
included in a larger version of this dataset, which then led to 
replicates of the participant's data.  Thus, we knew individuals with 
friends' data had replicates, but this version was supposed to have that 
cleaned up.

Finally, this particular data was collected on a rolling basis, and thus 
the 90 day recounting of drinking covers a large swath of the calendar 
year.  We are currently working on a paper that examines student 
drinking on typical weekdays vs. weekends vs. 21st birthday vs. holidays 
(including St. Patrick's Day, though July 4th and New Year's Eve are 
much heavier drinking days...).

Hopefully this clarifies certain aspects of the data, and I definitely 
appreciate folks' input on the computational aspects -- I'm certainly 
learning.

[BTW, I'm currently fitting a second model to the one reported below, 
with a random effect for weekday -- currently at 30 minutes and counting.]

cheers, Dave



### updated 6.23.10 -- thinking through computational speed with lmer
### and MCMCglmm
#
### read in *updated* drinking data
drink.df <- read.table("drinksUPDATED.txt",
						header = TRUE, sep = "\t")
str(drink.df) # 57K rows

### NOTE: slight variable changes since last iteration
### id is id variable (shocking)
### gender is binary factor
### fratsor is binary factor (in fraternity or sorority vs. not)
### weekday now has day of week
### weekend is now weekday (Sun - Thurs) vs. weekend (Fri - Sat)
### drinks has number of drinks consumed

### trim trailing white spaces
levels(drink.df$weekday) <- sub('[[:space:]]+$', '',
                                levels(drink.df$weekday))

### re-order factor
library(gdata)
drink.df$weekday <- reorder.factor(drink.df$weekday,
                                   new.order = c(4,2,6,7,5,1,3))
levels(drink.df$weekday)

### distribution of outcome
table(drink.df$drinks)
plot(table(drink.df$drinks), lwd=2) # zero-inflated

### how many people?
length(unique(drink.df$id)) # 980
sort(table(drink.df$id)) # max of 90 observations per person

### how do drinks vary across days of week and also gender and fratsor?
#
with(drink.df, tapply(drinks, list(gender, fratsor), mean))
### not surprising, but men in fraternities drink notably more

with(drink.df, tapply(drinks, list(gender, weekday, fratsor), mean))
### general peak on friday and/or saturday, but differences in rest
### of week for fraternity/sorority crowd...

### speed tests
#
### fit random intercept and random slope for weekday
### fixed effects for gender and weekday and interaction
#
### Poisson GLMM with glmer()
library(lme4)

### NOTE: weekday is different from the last set of code; last time it
###			was binary, this time it is a 7-level factor
#
### random intercept
system.time(
drk.glmer <- glmer(drinks ~ weekday*gender + (1 | id),
					data = drink.df, family = poisson,
					verbose = TRUE)
)
summary(drk.glmer)

### timing
#
###    user  system elapsed
### 257.268  39.115 295.754
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: drinksUPDATED.txt
URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20100623/f0ef694b/attachment.txt>
#
Hi Dave,

Sorry for the late response on this - I've been doing field work.  I
think there is little scope for speeding up the time per iteration in
MCMCglmm without considerable extra work.  Most of the CPU time is
spent either solving the mixed model equations or generating random
(normal) deviates.  It may be possible to multi-thread the latter, but
how easy that would be, and how much of a speed increase you would
get, I'm not sure.  For ZIP models at least, the long computing time
is a result of poor mixing.  The next version (as you know) will allow
hurdle models to be fitted.  These have much better mixing properties
and in terms of model fit are very similar to ZIP models.
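
One way to see that mixing, not cost per iteration, is the bottleneck 
(a sketch, using the zero-inflated fit from Dave's script and the coda 
package, which MCMCglmm already loads):

```r
## Effective sample size: how many independent draws the chain is worth.
## If it is tiny relative to the number of stored samples, running the
## chain faster would not help nearly as much as a better-mixing
## parameterization (e.g. the hurdle model mentioned above).
library(coda)
effectiveSize(drk.zimcmc$Sol)   # fixed effects
autocorr.diag(drk.zimcmc$VCV)   # autocorrelation in the (co)variances
```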

Cheers,

Jarrod
On 19 Jun 2010, at 16:42, David Atkins wrote:

1 day later
#
First, thanks to Doug, Jarrod, Paul, and others who chimed in on this. 
I will take a stab at a summary:

For the most part, it sounds like there are no simple solutions for 
computational time.  Specifically, beyond obvious differences in 
hardware, there may be some minor gains from an optimized BLAS, but as 
Doug notes, this is not a panacea.

Doug and Jarrod both commented on the fact that at its heart, there is a 
single optimization process going on, so it's not clear how this could 
be threaded or parallelized (and it would certainly not be trivial to try).

Finally, Doug has an option in the development version of (g)lmer that 
allows "coarse" but faster fits.  This might be quite useful for initial 
model exploration.
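
In later lme4 releases this surfaced as the nAGQ argument to glmer(); 
the following is a sketch under that assumption, not the exact 
interface of the 2010 development version:

```r
## nAGQ = 0 skips the costly joint optimization over the fixed effects,
## giving a coarser but much faster fit -- useful for screening models
## before a final fit with the default nAGQ = 1.
drk.fast <- glmer(drinks ~ weekday * gender + (1 | id),
                  data = drink.df, family = poisson, nAGQ = 0)
```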

Again, thanks all.

cheers, Dave



Dave Atkins, PhD
Research Associate Professor
Department of Psychiatry and Behavioral Science
University of Washington
datkins at u.washington.edu

Center for the Study of Health and Risk Behaviors (CSHRB)		
1100 NE 45th Street, Suite 300 	
Seattle, WA  98105 	
206-616-3879 	
http://depts.washington.edu/cshrb/
(Mon-Wed)	

Center for Healthcare Improvement, for Addictions, Mental Illness,
   Medically Vulnerable Populations (CHAMMP)
325 9th Avenue, 2HH-15
Box 359911
Seattle, WA 98104
http://www.chammp.org
(Thurs)