Nathan Leon Pace, MD, MStat
University of Utah
n.l.pace at utah.edu
W: 801.581.6393
F: 801.581.4367
M: 801.205.1019
> From: Douglas Bates <bates at stat.wisc.edu>
> Date: Thu, 1 Nov 2007 16:42:29 -0500
> To: Rob Forsyth <r.j.forsyth at newcastle.ac.uk>
> Cc: <r-sig-mixed-models at r-project.org>
> Subject: Re: [R-sig-ME] nlme and NONMEM
>
> On 11/1/07, Rob Forsyth <r.j.forsyth at newcastle.ac.uk> wrote:
>> I'd appreciate hearing from anyone (off list if you think it more
>> appropriate) who can share their comparative experiences of
>> nonlinear mixed-effects modelling with both nlme and NONMEM. The
>> latter appears to be the traditional tool of choice, particularly in
>> pharmacology.
>> Having built up some familiarity with nlme I am now collaborating (on
>> a non-pharmacological project) with someone strongly encouraging me
>> to move to NONMEM, although that clearly represents another
>> considerable learning curve. The main argument in its favour is the
>> relative difficulty I have had in getting nlme models to converge on
>> my relatively sparse datasets, particularly when (as in my case) I am
>> interested in the random-effects covariance matrix and wish to avoid
>> having to constrain it to a diagonal form with pdDiag().
>
>> I note the following comment from Douglas Bates on the R-help archive
>
>>> The nonlinear optimization codes used by S-PLUS and R are different.
>>> There are advantages to the code used in R relative to the code used
>>> in S-PLUS but there are also disadvantages. One of the disadvantages
>>> is that the code in R will try very large steps during its initial
>>> exploration phase and can then get trapped in remote regions of the
>>> parameter space. For nlme this means that the estimate of the
>>> variance-covariance matrix of the random effects becomes singular.
>
>>> Recent versions of the nlme library for R have a subdirectory called
>>> scripts that contains R scripts for the examples from each of the
>>> chapters in our book. If you check them you will see that not all of
>>> the nonlinear examples work in the R version of nlme. We plan to
>>> modify the choice of starting estimates and the internal algorithms to
>>> improve this, but it is a long and laborious process. I ask for your
>>> patience.
>
>> Can Doug or anyone comment on whether the development work on
>> lme4:::nlmer has included any steps in this direction or not?
>
> Yes.
>
> The algorithm in nlme alternates between solving a linear
> mixed-effects problem to update estimates of the variance components
> and solving a penalized nonlinear least squares problem to update
> estimates of the fixed-effects parameters and our approximation to the
> conditional distribution of the random effects. An algorithm that
> alternates between two conditional optimizations is appealing because
> each of the sub-problems is much simpler than the general problem.
> However, it can have poor convergence properties; in particular, it
> may end up bouncing back and forth between two different conditional
> optima.
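The poor-convergence behaviour of such alternating schemes is easy to see on a toy problem. Below is a minimal Python sketch (illustrative only; the coupled quadratic and the function name `alternate` are invented for this example and have nothing to do with nlme's actual objective): each conditional sub-problem is solved exactly at every sweep, yet when the two parameter blocks are strongly coupled the iterates make almost no progress toward the joint optimum.

```python
def alternate(c, x=1.0, y=1.0, sweeps=50):
    """Block-wise minimization of f(x, y) = x**2 + y**2 + c*x*y:
    each sweep solves the two conditional problems exactly."""
    for _ in range(sweeps):
        x = -c * y / 2.0  # argmin over x with y held fixed
        y = -c * x / 2.0  # argmin over y with x held fixed
    return x, y

# Weak coupling between the blocks: the sweeps home in on the
# joint optimum at (0, 0) almost immediately.
print(alternate(c=1.0))

# Strong coupling (c close to 2): every conditional problem is still
# solved exactly, yet after 50 sweeps the iterates remain far from the
# joint optimum -- the scheme crawls along a narrow valley.
print(alternate(c=1.99))
```

The contraction factor per sweep is (c/2)**2, so the closer the coupling constant is to 2, the slower the alternating scheme converges, even though each half-step is optimal.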
>
> Also, at the time we wrote nlme we tried to remove the constraints on
> the variance components by transforming them away (in simple
> situations we iterate on the logarithm of the relative variances of
> the random effects). This works well except when the estimate of a
> variance component is zero: trying to reach zero while iterating on
> the logarithm scale leads to a very flat likelihood surface.
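The flat-surface problem can be sketched with a toy objective (invented for this illustration; the real profiled objective in nlme is far more complicated). Take an objective in a variance v >= 0 whose minimum sits on the boundary v = 0 and optimize on the unconstrained scale theta = log(v): the gradient vanishes as theta decreases, so a gradient-based optimizer slows to a crawl instead of cleanly reporting a boundary solution.

```python
import math

# Toy objective in v = exp(theta) that is minimized at the boundary
# v = 0, i.e. at theta = -infinity on the log scale.
def objective(theta):
    return math.exp(theta)

def gradient(theta):
    return math.exp(theta)  # d/dtheta of exp(theta)

# Plain gradient descent on theta: the surface flattens as theta
# decreases, the steps shrink, and the iterates inch toward -infinity
# without ever producing a clean "converged at zero" answer.
theta = 0.0
for sweep in range(200):
    theta -= gradient(theta)
print(theta, gradient(theta))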
>
> In the nlmer function I use the same parameterization of the
> variance-covariance of the random effects as in lmer and use the
> Laplace approximation to the log-likelihood. Both of these changes
> should provide more reliable convergence, although the nlmer code has
> not been vetted to nearly the same extent as has the nlme code. In
> other words, I am confident that the algorithm is superior but the
> implementation may still need some work.
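For a single scalar random effect, the Laplace approximation replaces the integral over the random effect with a Gaussian integral centred at the conditional mode. A self-contained Python sketch with a toy integrand (the function g below is invented for this example; it is not the actual lmer/nlmer objective):

```python
import math

# Laplace approximation to I = integral of exp(-g(b)) db:
# expand g to second order around its minimizer b_hat, giving
#   I ~= exp(-g(b_hat)) * sqrt(2*pi / g''(b_hat)).
def g(b):  # toy "penalized deviance" in a scalar random effect b
    return 0.5 * (b - 1.0) ** 2 + 0.1 * b ** 4

# locate the conditional mode on a grid, then estimate g'' there
# by a central second difference
b_hat = min((k / 1000.0 for k in range(-5000, 5000)), key=g)
h = 1e-4
g2 = (g(b_hat + h) - 2.0 * g(b_hat) + g(b_hat - h)) / h ** 2

laplace = math.exp(-g(b_hat)) * math.sqrt(2.0 * math.pi / g2)

# brute-force quadrature of the same integral for comparison
quad = sum(math.exp(-g(k / 100.0)) for k in range(-1000, 1000)) * 0.01

print(laplace, quad)
```

For this nearly Gaussian integrand the two numbers agree closely; the approximation degrades as the integrand becomes more skewed, which is one reason the quality of the Laplace approximation is model-dependent.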
>
> Regarding NONMEM, I think the work Jose Pinheiro and I did on nlme and
> my current work on lme4 are based on a different philosophy than the
> one underlying NONMEM. As I have mentioned on this and other forums
> (fora?), I want to be confident that the results from the code that I
> write
> actually do represent an optimum of the objective function (such as
> the likelihood or log-likelihood). Nonlinear mixed-effects models for
> sparse data frequently end up being over-parameterized. In such cases
> I view it as a feature and not a bug that nlme or nlmer will indicate
> failure to converge. They may also fail to converge when there is a
> well-defined optimum. That behavior is not a feature.
>
> As I understand it from people who have used NONMEM (I once had access
> to a copy of NONMEM but was never successful in getting it to run and
> haven't tried since then) it will produce estimates just about every
> time it is run. Considering how ill-defined the parameter estimates
> in some nonlinear mixed-effects model fits can be, I don't view this
> as a feature.
>
> Many people feel that statistical techniques and statistical software
> are some sort of magic that can extract information from data, even
> when the information is not there. As I understand it from
> conversations many years ago with Lewis Sheiner, his motivation in
> developing NONMEM (with Stu Beal) was to be able to use routine
> clinical data (such as the Quinidine data in the nlme package) to
> estimate population pharmacokinetic parameters.
>
> Routine clinical data like these are very sparse. In the Quinidine
> example the majority of subjects have 1, 2, or 3 concentration
> measurements:
>
>> table(table(subset(Quinidine, !is.na(Quinidine$conc))$Subject))
>
>  1  2  3  4  5  6  7 10 11
> 46 33 31  9  3  8  2  1  3
>
> and frequently these measurements are at widely spaced time points
> relative to the dosing schedule. Such cases contribute almost no
> information to the parameter estimates, yet I have had pharmacologists
> suggest to me that it would be wonderful to use study designs in which
> each patient has only one concentration measurement, trusting that the
> magic of nonlinear mixed effects will somehow conjure estimates from
> such data.
>
> The real world doesn't work like that. If you have only one
> observation per person it should make sense that no amount of
> statistical magic will be able to separate the per-observation noise
> from the per-person variability.
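This non-identifiability can be made concrete with a two-variance toy model: y_i = b_i + e_i with b_i ~ N(0, var_b) per person and e_i ~ N(0, var_e) per observation. With one observation per person, marginally y_i ~ N(0, var_b + var_e), so any split of the same total variance gives exactly the same likelihood. A Python sketch with simulated data (illustrative only; the helper `loglik` is invented for this example):

```python
import math
import random

def loglik(y, var_b, var_e):
    """Marginal log-likelihood when each person contributes exactly one
    observation y_i = b_i + e_i: only the sum var_b + var_e enters."""
    v = var_b + var_e
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + yi * yi / v)
               for yi in y)

random.seed(1)
# one observation per person; true variances 1.0 (person) + 0.5 (noise)
y = [random.gauss(0.0, math.sqrt(1.5)) for _ in range(200)]

# two wildly different variance splits with the same total are
# indistinguishable -- the likelihood surface has a perfectly flat ridge
print(loglik(y, 1.4, 0.1))
print(loglik(y, 0.1, 1.4))
```

No optimizer can pick a point on a perfectly flat ridge for a data-driven reason, which is why a fitted "answer" from such a design tells you more about the software's defaults than about the data.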
>
> So when I am told that NONMEM converged to parameter estimates on a
> problem where nlme or nlmer failed to converge I think (and sometimes
> say) "You mean NONMEM *declared* convergence to a set of estimates".
> Declaring convergence and converging can be different.
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models