nlme and NONMEM

3 messages · Rob Forsyth, Douglas Bates, Nathan Leon Pace, MD, MStat

#
I'd appreciate hearing from anyone (off list if you think it more appropriate) who can share their comparative experiences of non-linear mixed effects modelling with both nlme and NONMEM. The latter appears to be the traditional tool of choice, particularly in pharmacology. Having built up some familiarity with nlme, I am now collaborating (on a non-pharmacological project) with someone strongly encouraging me to move to NONMEM, although that clearly represents another considerable learning curve. The main argument in favour is the relative difficulty I have had in getting nlme models to converge on my relatively sparse datasets, particularly when (as in my case) I am interested in the random effects covariance matrix and wish to avoid having to coerce it using pdDiag().

I note the following comment from Douglas Bates on the R-help archive. Can Doug or anyone comment on whether the development work on lme4:::nlmer has included any steps in this direction or not?

Thanks

Rob Forsyth
#
On 11/1/07, Rob Forsyth <r.j.forsyth at newcastle.ac.uk> wrote:
Yes.

The algorithm in nlme alternates between solving a linear mixed-effects problem to update estimates of the variance components and solving a penalized nonlinear least squares problem to update estimates of the fixed-effects parameters and our approximation to the conditional distribution of the random effects. This type of algorithm, which alternates between two conditional optimizations, is appealing because each of the sub-problems is much simpler than the general problem. However, it may have poor convergence properties; in particular, it may end up bouncing back and forth between two different conditional optima.
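As a toy illustration (a Python sketch of alternating conditional optimization in general, not nlme's actual LME/PNLS steps), consider minimizing a strongly coupled quadratic by alternating exact conditional minimizations. Each sub-problem is trivial to solve, yet joint progress is slow:

```python
# Toy analogue of an alternating ("conditional") optimization scheme.
# f(x, y) = x^2 + y^2 + 1.9*x*y is minimized at (0, 0), but the strong
# coupling between x and y means coordinate-wise updates crawl along a
# narrow valley; in nonconvex problems the iterates can even cycle
# between two different conditional optima instead of converging.

def conditional_min_x(y):
    # argmin_x f(x, y) for fixed y: df/dx = 2x + 1.9y = 0
    return -0.95 * y

def conditional_min_y(x):
    # argmin_y f(x, y) for fixed x, by symmetry
    return -0.95 * x

x, y = 1.0, 1.0
for _ in range(50):
    x = conditional_min_x(y)
    y = conditional_min_y(x)

# After 50 full alternations the iterate is still visibly away from
# the joint optimum at (0, 0): each sweep shrinks the error by only
# a factor of 0.95**2.
print(x, y)
```

Each conditional solve is exact, but the joint error decays geometrically with a ratio close to 1, which is the flavor of the convergence trouble described above.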

Also, at the time we wrote nlme we tried to remove the constraints on the variance components by transforming them away. (In simple situations we iterate on the logarithm of the relative variances of the random effects.) This works well except when the estimate of a variance component is zero: trying to reach zero when iterating on the logarithm scale can lead to very flat likelihood surfaces.
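A minimal numerical sketch of that flatness (Python, with a made-up one-parameter marginal model, not nlme's actual profiled likelihood): once exp(theta) is negligible relative to the residual variance, enormous moves in theta barely change the log-likelihood, so a gradient-based optimizer has almost nothing to work with.

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(0.0, 1.0, size=100)   # data with zero "extra" variance

def loglik(theta, y):
    # Toy marginal log-likelihood: residual variance fixed at 1.0,
    # random-effect variance parameterized as exp(theta).
    v = 1.0 + np.exp(theta)
    return -0.5 * np.sum(np.log(2 * np.pi * v) + y**2 / v)

# Moving theta from -5 to -15 changes the log-likelihood only slightly,
# and from -15 to -30 it changes essentially not at all: the surface
# flattens out as theta heads toward -infinity (variance -> 0).
print(loglik(-5, y) - loglik(-15, y))
print(loglik(-15, y) - loglik(-30, y))
```

The true optimum at variance zero sits at theta = -infinity, which the optimizer can approach only along this near-flat plateau.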

In the nlmer function I use the same parameterization of the
variance-covariance of the random effects as in lmer and use the
Laplace approximation to the log-likelihood.  Both of these changes
should provide more reliable convergence, although the nlmer code has
not been vetted to nearly the same extent as has the nlme code.  In
other words, I am confident that the algorithm is superior but the
implementation may still need some work.
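For the Laplace approximation itself, here is a hand-rolled one-dimensional sketch (Python; the function h below is a made-up stand-in, not lme4's penalized deviance): expand h to second order around its minimizer and integrate the resulting Gaussian in closed form.

```python
import numpy as np
from math import sqrt, pi

# Laplace approximation of integral exp(-h(u)) du:
#   integral ~= exp(-h(u_hat)) * sqrt(2*pi / h''(u_hat)),
# where u_hat minimizes h. Toy stand-in for a penalized deviance:
def h(u):
    return 0.5 * u**2 + 0.1 * u**4

us = np.linspace(-5.0, 5.0, 200001)

# Crude minimizer on a grid, curvature by central finite difference.
u_hat = us[np.argmin(h(us))]
eps = 1e-4
h2 = (h(u_hat + eps) - 2 * h(u_hat) + h(u_hat - eps)) / eps**2

laplace = np.exp(-h(u_hat)) * sqrt(2 * pi / h2)

# Reference value by trapezoidal quadrature over the same grid.
f = np.exp(-h(us))
exact = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(us))

print(laplace, exact)
```

Because the quartic term makes the integrand lighter-tailed than the matched Gaussian, the Laplace value overshoots the quadrature value here; the point is only the mechanics of the approximation, not its accuracy for any particular model.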

Regarding NONMEM, I think the work Jose Pinheiro and I did on nlme and my current work on lme4 are based on a different philosophy than the one underlying NONMEM. As I have mentioned on this and other forums (fora?), I want to be confident that the results from the code that I write actually do represent an optimum of the objective function (such as the likelihood or log-likelihood). Nonlinear mixed-effects models for sparse data frequently end up being over-parameterized. In such cases I view it as a feature, and not a bug, that nlme or nlmer will indicate failure to converge. They may also fail to converge when there is a well-defined optimum; that behavior is not a feature.

As I understand it from people who have used NONMEM (I once had access
to a copy of NONMEM but was never successful in getting it to run and
haven't tried since then) it will produce estimates just about every
time it is run.  Considering how ill-defined the parameter estimates
in some nonlinear mixed-effects model fits can be, I don't view this
as a feature.

Many people feel that statistical techniques and statistical software
are some sort of magic that can extract information from data, even
when the information is not there.  As I understand it from
conversations many years ago with Lewis Sheiner, his motivation in
developing NONMEM (with Stu Beal) was to be able to use routine
clinical data (such as the Quinidine data in the nlme package) to
estimate population pharmacokinetic parameters.

Routine clinical data like these are very sparse. In the Quinidine example the majority of subjects have 1, 2 or 3 concentration measurements:

    measurements:   1  2  3  4  5  6  7 10 11
    subjects:      46 33 31  9  3  8  2  1  3

and frequently these measurements are at widely spaced time points
relative to the dosing schedule.  Such cases contribute almost no
information to the parameter estimates, yet I have had pharmacologists
suggest to me that it would be wonderful to use study designs in which
each patient has only one concentration measurement and somehow the
magic of nonlinear mixed effects will conjure estimates from such
data.

The real world doesn't work like that.  If you have only one
observation per person it should make sense that no amount of
statistical magic will be able to separate the per-observation noise
from the per-person variability.
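A quick simulation (Python; a toy one-observation-per-person model, not the Quinidine fit) makes this concrete: with a single observation per person, the marginal likelihood depends on the per-person and per-observation variances only through their sum, so any split of the total variance fits the data exactly equally well.

```python
import numpy as np

rng = np.random.default_rng(7)
# One observation per person: y_i = b_i + e_i with
# b_i ~ N(0, s2_b) (per-person effect), e_i ~ N(0, s2_e) (noise).
# Simulate with total variance 1.0.
y = rng.normal(0.0, 1.0, size=200)

def loglik(s2_b, s2_e, y):
    v = s2_b + s2_e   # marginal variance: only the SUM ever appears
    return -0.5 * np.sum(np.log(2 * np.pi * v) + y**2 / v)

# Opposite splits of the same total variance give identical
# log-likelihoods, so the two components are not separately
# identifiable from these data.
print(loglik(0.9, 0.1, y) == loglik(0.1, 0.9, y))
```

An optimizer pointed at this likelihood faces an entire ridge of equally good (s2_b, s2_e) pairs, which is exactly the kind of ill-defined estimate the text warns about.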

So when I am told that NONMEM converged to parameter estimates on a
problem where nlme or nlmer failed to converge I think (and sometimes
say) "You mean NONMEM *declared* convergence to a set of estimates".
Declaring convergence and converging can be different.
#
Hi All,

This thread reminds me of an experience using nlme about 10 years ago. I was re-modeling a previously analyzed (and published) pharmacokinetic data set on the drug remifentanil; NONMEM had been used to estimate a 3-compartment (6-parameter) model. The data included multiple plasma concentration values for each subject.

Using nlme, no convergence was possible for a three-compartment model despite various choices of control settings and covariance structure. A two-compartment model converged.

Doug provided very useful tips to me at the time. For example, a visual inspection of the raw data (the time course of remifentanil concentration decay) revealed only one inflection point in the decay curves for most subjects, whereas two inflection points would be consistent with a three-compartment model. The data were not sufficient to fit a three-compartment model.

I have never used NONMEM, but associates who used it in the 90s assured me that NONMEM could always be tweaked to converge. This was considered a virtue.

This is an example of NONMEM allowing over-parameterized models.

Nathan