Message-ID: <40e66e0b0909232044o79435305q2eaa9606b2960a7b@mail.gmail.com>
Date: 2009-09-24T03:44:36Z
From: Douglas Bates
Subject: I'm sorry, and here is what I mean to ask about speed
In-Reply-To: <13e802630909231842jc9c001eic921337c43db20fb@mail.gmail.com>

Thanks for rephrasing your question, Paul.

On Wed, Sep 23, 2009 at 8:42 PM, Paul Johnson <pauljohn32 at gmail.com> wrote:
> I'm sorry I made Doug mad and I'm sorry to have led the discussion off
> into such a strange, disagreeable place.
>
> Now that I understand your answers, I believe I can ask the question
> in a non-naive way. I believe this version should not provoke some of
> the harsh words that I triggered in my awkward question.
>
> New Non-Naive version of the Speed Question
>
> Do you have a copy of HLM6 on your system? Maybe you could help me by
> running the same model in R (with any of the packages such as lme4,
> nlme, or whatever) and HLM6 and let us all know if you get similar
> estimates and how long it takes to run each one.

I still claim it would help to have a reproducible example with known
data and a known model to fit.
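
For instance, a minimal sketch of such an example, using simulated data
in place of the private survey data, might look like the following. The
variable names, the number of counties, the effect sizes and the choice
of a random slope on a respondent-level covariate are all made up for
illustration; only the overall shape (27000 cases nested in counties, a
random intercept plus a random slope, fit by REML) follows the
description quoted below.

```r
library(lme4)

set.seed(20090924)                    # fixed seed: everyone gets the same data

n_county <- 300                       # hypothetical number of counties
n_per    <- 90                        # 300 * 90 = 27000 cases
idx      <- rep(seq_len(n_county), each = n_per)   # county index per case
n        <- length(idx)

x1 <- rnorm(n)                        # respondent-level covariate (made up)
z1 <- rnorm(n_county)[idx]            # county-level covariate (made up)

# county-specific deviations in intercept and in the slope of x1
b0 <- rnorm(n_county, sd = 1.0)[idx]
b1 <- rnorm(n_county, sd = 0.5)[idx]

y <- 2 + (1 + b1) * x1 + 0.5 * z1 + b0 + rnorm(n)

dat <- data.frame(y = y, x1 = x1, z1 = z1, county = factor(idx))

# time the REML fit of a random-intercept, random-slope model
timing <- system.time(
    fit <- lmer(y ~ x1 + z1 + (1 + x1 | county), data = dat, REML = TRUE)
)
print(timing)
print(fit)
```

Writing dat out with write.csv() would let the same data and the same
model be run in HLM6, so that both the estimates and the elapsed times
can be compared directly.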

> Here's why I ask.
>
> My colleague has HLM6 on Windows XP and he compared a two-level linear
> mixed effects model fitted with lmer from lme4 against HLM6. He
> surprised me by claiming that the HLM6 model estimation was completed
> in about 1.5 seconds and the lmer estimation took 50 seconds. That
> did not seem right to me. I looked a bit at his example and made a
> few mental notes so I could ask you what to look for when I go back to
> dig into this. There are 27000 cases in his dataset and he has about
> 25 variables at the lower level of observation and 4 or 5 variables at
> the higher level, which I think is the county of survey respondents.
> He is fitting a random intercept (random across counties) and several
> random slopes for the higher level variables.
>
> He pointed out that the MLwiN website reported speed differences in
> 2006 that were about the same. Of course, you and I know that R and
> all of the mixed effects packages have improved significantly since
> then. That is why the speed gap on the one Windows XP system surprised
> me.
>
> Can you tell me if you see a difference between the two programs (if
> you have HLM6)? If you see a difference of the same magnitude, it may
> mean we are not mistaken in our conclusion. But if you see no
> difference, then it will mean I'm getting it wrong and I should
> investigate more. If I can't solve it, I should provide a reproducible
> example for your inspection. I will ask permission to release the
> private data to you in that case.
>
> Perhaps you think there are good reasons why R estimation takes longer. E.g.:
> 1. HLM programmers have full access to benefit from optimizations in
> lmer and other open programs, but they don't share their optimizations
> in return.
> 2. lmer and other R routines are making calculations in a better way,
> a more accurate way, so we should not worry that they take longer.
> That was my first guess; in the original mail I said I thought that
> HLM was using PQL whereas lmer is using Laplace or Adaptive Gaussian
> Quadrature. But Doug's comment indicated that I was mistaken to
> expect a difference there because REML is the default in lmer and it
> is also what HLM is doing, and there's no involvement of quadrature or
> integral approximation in a mixed linear model (gaussian dependent
> variable).
>
> On the other hand, perhaps you are (like me) surprised by this
> difference and you want to help me figure out the cause of the
> differences. If you have ideas about that, maybe we can work together
> (I don't suck at C!). I have a fair amount of experience profiling programs
> in C and did some optimization help on a big-ish C++ based R package
> this summer.
>
> So far, I have a simple observer's interest in this question. I
> advise people whether it is beneficial for them to spend their scarce
> resources on a commercial package like HLM6 and one of the factors
> that is important to them is how "fast" the programs are. I
> personally don't see an urgent reason to buy HLM because it can
> estimate a model in 1 second and an open source approach requires 50
> seconds. But I'm not the one making the decision. If I can make the R
> version run almost as fast as HLM6, or provide reasons why people
> might benefit from using a program that takes longer, then I can do my
> job of advising the users.
>
> I am sorry if this question appears impertinent or insulting. I do not
> mean it as a criticism.
>
> --
> Paul E. Johnson
> Professor, Political Science
> 1541 Lilac Lane, Room 504
> University of Kansas
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>