Timings with SAMM, lme4, nlme
I also meant to cc: the list on this response. Obviously I need to work on my email skills. ---------- Forwarded message ---------- From: Douglas Bates <bates at stat.wisc.edu> Date: Feb 1, 2007 12:39 PM Subject: Re: [R-sig-ME] Timings with SAMM, lme4, nlme To: Kevin Wright <kw.statr at gmail.com>
On 2/1/07, Kevin Wright <kw.statr at gmail.com> wrote:
About a year and a half ago I did some comparison of model fitting with
SAMM, lme4, and nlme. Since Doug Bates put out a request for some recent
timings, I am repeating/extending my comparisons. In the interim, I have
switched computers, so the timings of the samm function facilitate comparing
the speed across the two computers.
First, the results (view with fixed-width font)
Setup T30 D620 T30 D620 D620 D620
Model \ Function samm samm lmer lmer lmer2 lme
---------------- ---- ---- ---- ---- ----- -----
yrs|id + yrs|sch f f 30.0 kill 65
yrs|id + 1|sch 10.5 7.8 13.7 kill 23
1|id + yrs|sch 9.4 9.5 11.5 kill 7.3
1|id + 1|sch 6.5 5.7 5.6 kill 5.4 60
yrs|sch 4.8 4.0 f kill 0.5
1|sch 3.4 2.2 0.4 0.7 0.3 4
yrs|id 5.7 5.7 8.2 kill 15 150
1|id 4.1 3.4 2.4 kill 4.2 14
In the table above, "Setup" refers to the following two computer
configurations:
T30: IBM Thinkpad T30, 1 GB ram, 1.8 Ghz processor, Windows 2000
SAMM version 1.1, lme4 & nlme current on 5.24.2005
D620: Dell Latitude D620, 2 GB ram, 1.8 Ghz duo core processor, WinXP
SAMM version 1.1 lme4 & nlme current as of 1.30.2007
I timed most model fits only once. I did a quick inspection of the results
from the different modelling functions to persuade myself that I was fitting
the same model (i.e. that estimates were similar).
Thanks for sending the timings, Kevin. As Harold Doran mentioned in his reply to you, it is best to use the control options niterEM = 0 and gradient = FALSE when fitting lmer models to large data sets. The theory of how to calculate the gradient and how to create the ECME step for a linear mixed model is some of the most beautiful mathematics that I have ever done. Unfortunately it is not very practical mathematics because one of those calculations is the equivalent of several hundred function evaluations on models with large data sets. It is much better to just let the optimizer work out an approximation to the gradient and to skip the EM or ECME iterations altogether. I repeated your timings on R-forge.R-project.org, an Opteron-based server running Debian Linux. I enclose the results. Once you get rid of the ECME iterations and the gradient calculations lmer and lmer2 are in the same ballpark with a slight advantage to lmer on most of the models. Both functions took about a minute or less to fit models with up to 20000 random effects to about 25000 observations, which is not bad at all. As you know, not all these models can be fit by lme and, as the timings show, lme is much slower on the models that it can fit. This is not surprising, there has been close to a decade's worth of development of the computational methods since lme was initially designed.
Here are the full models I used:
lmer(math~ gr + sx + eth + cltype + (1+yrs|id) + (1+yrs|sch), data=star)
samm(math ~ gr + sx + eth + cltype, random=~ us(link(~1+yrs)):id +
us(link(~1+yrs)):sch, data=star, na.method.X="omit")
Starting with these full models, I tried reduced models with simpler random
effects structures. These are the different rows in the table above.
Observations:
(1) The performance of the common version of SAMM on the two computers suggests
the Dell is slightly faster than the IBM.
(2) On 24 May 2005, lmer and samm has roughly similar timings. On 31 Jan
2007, lmer is nearly unusable for this data (I killed the job after 5-10
minutes of waiting).
(3) The current version of lmer2 is the only function that appears to fit
all models.
(4) The current version of lme is slower than sammm/lmer2 for
those models I tried to fit.
FYI. SAMM is officially available at
http://www.vsni.co.uk/products/samm/(but seems not to be there now,
perhaps in preparation for release of a new
version). Unofficially it is available here: ftp://ftp.dpi.nsw.gov.au/
I'm not sure if I have access to SAMM to check the timings. Is SAMM proprietary? My recollection is that it is proprietary and that it is Windows-only. Either one of those characteristics is the kiss of death for my wanting to use it. Are you sure that SAMM is fitting a model with partially-crossed random effects? If you are only considering students and schools as grouping factors for the random effects there will be little difference between assuming nested random effects and correcting for the partial crossing. This is because there are very few students in this study who attend more than one school. If you considered teachers as well as students and schools as grouping factors then you could more easily discern the difference in model fits taking into account the crossing or assuming that each student:teacher:school combination is unique.
I hope this information is useful. Thanks for the progress evident in the
lme4 package.
Kevin Wright
[[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
-------------- next part --------------
Random.effects lmer lmer2 lme
(yrs|id)+(yrs|sch) 51.203 62.640 NA
(yrs|id)+(1|sch) 14.497 22.533 NA
(1|id)+(yrs|sch) 13.449 5.188 NA
(1|id)+(1|sch) 2.540 2.056 NA
(yrs|id) 6.409 9.992 84.037
(1|id) 0.628 1.020 5.637
(yrs|sch) 0.860 0.533 3.544
(1|sch) 0.544 0.472 2.404
-------------- next part --------------
library(lme4)
data(star, package = 'mlmRev')
system.time(lmer(math ~ gr + sx + eth + cltype + (yrs|id) + (yrs|sch),
star, control = list(niterEM = 0, gradient = FALSE)))
system.time(lmer(math ~ gr + sx + eth + cltype + (yrs|id) + (1|sch),
star, control = list(niterEM = 0, gradient = FALSE)))
system.time(lmer(math ~ gr + sx + eth + cltype + (1|id) + (yrs|sch),
star, control = list(niterEM = 0, gradient = FALSE)))
system.time(lmer(math ~ gr + sx + eth + cltype + (1|id) + (1|sch),
star, control = list(niterEM = 0, gradient = FALSE)))
system.time(lmer(math ~ gr + sx + eth + cltype + (yrs|sch),
star, control = list(niterEM = 0, gradient = FALSE)))
system.time(lmer(math ~ gr + sx + eth + cltype + (1|sch),
star, control = list(niterEM = 0, gradient = FALSE)))
system.time(lmer(math ~ gr + sx + eth + cltype + (yrs|id),
star, control = list(niterEM = 0, gradient = FALSE)))
system.time(lmer(math ~ gr + sx + eth + cltype + (1|id),
star, control = list(niterEM = 0, gradient = FALSE)))
system.time(lmer2(math ~ gr + sx + eth + cltype + (yrs|id) + (yrs|sch),
star, control = list(niterEM = 0, gradient = FALSE)))
system.time(lmer2(math ~ gr + sx + eth + cltype + (yrs|id) + (1|sch),
star, control = list(niterEM = 0, gradient = FALSE)))
system.time(lmer2(math ~ gr + sx + eth + cltype + (1|id) + (yrs|sch),
star, control = list(niterEM = 0, gradient = FALSE)))
system.time(lmer2(math ~ gr + sx + eth + cltype + (1|id) + (1|sch),
star, control = list(niterEM = 0, gradient = FALSE)))
system.time(lmer2(math ~ gr + sx + eth + cltype + (yrs|sch),
star, control = list(niterEM = 0, gradient = FALSE)))
system.time(lmer2(math ~ gr + sx + eth + cltype + (1|sch),
star, control = list(niterEM = 0, gradient = FALSE)))
system.time(lmer2(math ~ gr + sx + eth + cltype + (yrs|id),
star, control = list(niterEM = 0, gradient = FALSE)))
system.time(lmer2(math ~ gr + sx + eth + cltype + (1|id),
star, control = list(niterEM = 0, gradient = FALSE)))
-------------- next part --------------
library(nlme)
data(star, package = 'mlmRev')
system.time(lme(math ~ gr + sx + eth + cltype, star, random = ~yrs|sch, na.action = na.omit))
system.time(lme(math ~ gr + sx + eth + cltype, star, random = ~1|sch, na.action = na.omit))
system.time(lme(math ~ gr + sx + eth + cltype, star, random = ~yrs|id, na.action = na.omit))
system.time(lme(math ~ gr + sx + eth + cltype, star, random = ~1|id, na.action = na.omit))
-------------- next part --------------
R version 2.5.0 Under development (unstable) (2007-01-31 r40628)
Copyright (C) 2007 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
Natural language support but running in an English locale
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
[Previously saved workspace restored]
library(nlme) data(star, package = 'mlmRev') system.time(lme(math ~ gr + sx + eth + cltype, star, random = ~yrs|sch, na.action = na.omit))
user system elapsed 3.544 0.168 3.713
system.time(lme(math ~ gr + sx + eth + cltype, star, random = ~1|sch, na.action = na.omit))
user system elapsed 2.404 0.128 2.532
system.time(lme(math ~ gr + sx + eth + cltype, star, random = ~yrs|id, na.action = na.omit))
user system elapsed 84.037 1.588 85.672
system.time(lme(math ~ gr + sx + eth + cltype, star, random = ~1|id, na.action = na.omit))
user system elapsed 5.637 0.124 5.759
proc.time()
user system elapsed 97.006 2.064 99.109 -------------- next part -------------- R version 2.5.0 Under development (unstable) (2007-01-31 r40628) Copyright (C) 2007 The R Foundation for Statistical Computing ISBN 3-900051-07-0 R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. Natural language support but running in an English locale R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R. [Previously saved workspace restored]
library(lme4)
Loading required package: Matrix Loading required package: lattice
data(star, package = 'mlmRev') system.time(lmer(math ~ gr + sx + eth + cltype + (yrs|id) + (yrs|sch),
+ star, control = list(niterEM = 0, gradient = FALSE))) user system elapsed 51.203 6.944 58.218
system.time(lmer(math ~ gr + sx + eth + cltype + (yrs|id) + (1|sch),
+ star, control = list(niterEM = 0, gradient = FALSE))) user system elapsed 14.497 0.172 14.669
system.time(lmer(math ~ gr + sx + eth + cltype + (1|id) + (yrs|sch),
+ star, control = list(niterEM = 0, gradient = FALSE))) user system elapsed 13.449 0.140 13.624
system.time(lmer(math ~ gr + sx + eth + cltype + (1|id) + (1|sch),
+ star, control = list(niterEM = 0, gradient = FALSE))) user system elapsed 2.540 0.080 2.623
system.time(lmer(math ~ gr + sx + eth + cltype + (yrs|sch),
+ star, control = list(niterEM = 0, gradient = FALSE))) user system elapsed 0.860 0.064 0.925
system.time(lmer(math ~ gr + sx + eth + cltype + (1|sch),
+ star, control = list(niterEM = 0, gradient = FALSE))) user system elapsed 0.544 0.060 0.606
system.time(lmer(math ~ gr + sx + eth + cltype + (yrs|id),
+ star, control = list(niterEM = 0, gradient = FALSE))) user system elapsed 6.409 0.392 6.802
system.time(lmer(math ~ gr + sx + eth + cltype + (1|id),
+ star, control = list(niterEM = 0, gradient = FALSE))) user system elapsed 0.628 0.084 0.711
system.time(lmer2(math ~ gr + sx + eth + cltype + (yrs|id) + (yrs|sch),
+ star, control = list(niterEM = 0, gradient = FALSE))) user system elapsed 62.640 4.204 66.843
system.time(lmer2(math ~ gr + sx + eth + cltype + (yrs|id) + (1|sch),
+ star, control = list(niterEM = 0, gradient = FALSE))) user system elapsed 22.533 0.944 23.492
system.time(lmer2(math ~ gr + sx + eth + cltype + (1|id) + (yrs|sch),
+ star, control = list(niterEM = 0, gradient = FALSE))) user system elapsed 5.188 0.216 5.406
system.time(lmer2(math ~ gr + sx + eth + cltype + (1|id) + (1|sch),
+ star, control = list(niterEM = 0, gradient = FALSE))) user system elapsed 2.056 0.032 2.089
system.time(lmer2(math ~ gr + sx + eth + cltype + (yrs|sch),
+ star, control = list(niterEM = 0, gradient = FALSE))) user system elapsed 0.533 0.032 0.565
system.time(lmer2(math ~ gr + sx + eth + cltype + (1|sch),
+ star, control = list(niterEM = 0, gradient = FALSE))) user system elapsed 0.472 0.016 0.486
system.time(lmer2(math ~ gr + sx + eth + cltype + (yrs|id),
+ star, control = list(niterEM = 0, gradient = FALSE))) user system elapsed 9.992 0.348 10.364
system.time(lmer2(math ~ gr + sx + eth + cltype + (1|id),
+ star, control = list(niterEM = 0, gradient = FALSE))) user system elapsed 1.020 0.024 1.046
proc.time()
user system elapsed 204.576 13.948 218.748