
absurd computation times of lme

6 messages · Christof Meigen, Douglas Bates, Peter Dalgaard

#
Hi Renaud,

Renaud Lancelot <lancelot at sentoo.sn> writes:
Well, in the case of the children I do have quite large datasets:
around 1000 children with altogether much more than 5000 measurements.
JPPS, which models a whole growth curve, needs the adult height of the
child. Still it is very hard to fit, and optimization sometimes fails
to converge even on what looks like perfect data; see for example
"Mathematical models of growth in stature throughout childhood" by
Ledford & Cole.

JPPS has 6 parameters, which is one more than I would use for the
little kids. The point in using a spline basis was to make as few
a priori assumptions about the model as possible and let "lme" do the
job of deciding what the curve should look like. Unfortunately, this
seems to be more than it can handle.

I just thought it would be a good way to avoid the kind of
"why did you choose that model" questions which might lead to
very annoying discussions if someone puts effort into rejecting
all of your views.

        Christof


-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
#
Christof Meigen <christof at nicht-ich.de> writes:
But you are also implicitly estimating the random effects for each
child.  These are sometimes regarded as 'nuisance' parameters but they
still need to be estimated, at least implicitly.  In this case there
would be about 6000 of them (1000 children by 6 random effects per
child).

I would recommend that you start with a spline model for the fixed
effects but use either a simple additive shift for the random effects
(random = ~1|Subject) or an additive shift and a shift in the time
trend (random = ~ age | Subject).  You simply don't have enough data
per child to estimate 6 random-effects parameters for each one.

There is a big difference when fitting random effects between adding
parameters in the fixed effects, which are estimated from all the
data, and adding parameters in the random effects, which are estimated
from the data for one subject.
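
A minimal sketch of the two suggested random-effects specifications, assuming the nlme and splines packages and a hypothetical data frame `growth` with columns `height`, `age`, and `Subject` (names are placeholders, not from the original thread):

```r
library(nlme)
library(splines)

## Spline basis in the fixed effects; each child gets an additive shift
fit1 <- lme(height ~ ns(age, df = 5), data = growth,
            random = ~ 1 | Subject)

## Additive shift plus a per-child linear trend in age
fit2 <- lme(height ~ ns(age, df = 5), data = growth,
            random = ~ age | Subject)

anova(fit1, fit2)  # does the extra slope (and its covariance) pay off?
```

The spline keeps the population curve flexible while the per-child model stays cheap: fit2 adds only one extra variance and one covariance over fit1.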
3 days later
#
Christof Meigen <christof at nicht-ich.de> writes:
This stuff can be tricky. We've had some success a couple of years back
with variants of models like


library(nlme)     # lme, corExp
library(splines)  # ns

lme.obj <- lme(log(Height) ~ ns(sqrt(Age), knots = sqrt(c(0.25, 0.5, 1, 5)),
                                Boundary.knots = c(0, sqrt(10))),
               random = ~ sqrt(Age) | ID,
               correlation = corExp(value = c(-log(0.2), 0.1),
                                    form = ~ sqrt(Age), nugget = TRUE))

This was also on children, albeit severely growth-retarded, and with a
less regular sampling than you seem to have. Notice that we felt it
was necessary to add some sort of correlation structure. One gotcha
which gave us convergence problems turned out to be due to the
starting value of corExp() being keyed to the *minimum* distance
between two measurements on the same individual. The really crucial
thing, however, was the need to work with transformations on both the
x and y axes in order to roughly stabilize the variance and avoid
"ringing" effects of fitting sharp gradients at small ages.
Well, I'm sure that Doug and Jose would appreciate any insights that
could help them speed up the algorithms...
#
Hi,

thanks a lot for all your hints. Alas, some problems remain

Douglas Bates <bates at stat.wisc.edu> writes:
I'm aware of that, and would not dare to estimate these parameters
independently per child, since I would be overfitting the data.

But I thought one could use lme to constrain this flexibility by
using information derived from the rest of the population. If this
only meant subtracting the mean curve, I wouldn't need lme, would I?
Does this really mean that the estimates for the random effects are
totally independent of the rest of the data? So, if my random
effect is flexible enough to model more or less any curve,
this "any" curve will be fitted to the data no matter how unlikely
it is (compared with the rest of the population) and how little
data is available on this subject?

The point is that the inclusion criterion for the children is that
they have _at least_ one measurement in each quarter, but some have
measurements every month or so. I thought lme would be a good
way to deal with this difference in the amount of information available.
Worse, this is for a PhD (luckily not mine) about growth
velocity. The medical Prof sees no problem, saying: when you
have two measurements you have a growth velocity for the timepoint
right between them.
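
That recipe is just a first difference assigned to the interval midpoint; a sketch in R (the `age` and `height` vectors are made-up illustration data):

```r
## Naive growth velocity: slope between consecutive measurements,
## assigned to the midpoint of each interval
age    <- c(0.00, 0.25, 0.50, 1.00)  # years (hypothetical)
height <- c(50, 59, 66, 75)          # cm  (hypothetical)

velocity <- diff(height) / diff(age)       # cm per year
midpoint <- head(age, -1) + diff(age) / 2  # timepoint of each estimate
```

With a 0.5 cm measurement error and quarter-year spacing, each such velocity estimate carries an uncertainty of roughly sqrt(2) * 0.5 / 0.25 ≈ 2.8 cm/year, which is why smoothing first seems preferable.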

I think this is a bad approach and suggested smoothing the curves
first. The approach of using (random = ~ age | Subject) or,
as seen from looking at the logs, better (random = ~ age^0.15 | Subject),
works as expected, but gives fits which are sometimes as far as
3 cm from the real measurements (while the measurement error
is assumed to be about 0.5 cm). These unusual decelerations are
exactly what the wannabe-PhD is interested in.

Finally, the "too little data" argument does not apply
to the set-up with the Berkeley Boys, each with 31 measurements,
where a 7-parameter spline basis in the random effects wouldn't
converge within several hours.

        Christof
#
Christof Meigen <christof at nicht-ich.de> writes:
I don't think I said that it was as simple as subtracting the mean
curve.  Mixed-effects models do perform a type of regularization of
the individual estimates but this is not as simple as subtracting the
mean curve.
Once again, I didn't say the estimates were totally independent of
the rest of the data.  What I was trying to say is that adding
several, possibly correlated, random effects for each subject adds
much more complexity than does adding parameters to the fixed effects.
Modelling with random effects causes some 'shrinkage to the mean'
relative to fitting each subject's data separately.  However, if you
are going to estimate a large number of random effects and their
variances and their covariances you need to have a large amount of
data for each subject and it will take a long time.
If you have 7 parameters in the random effects and a general
variance-covariance matrix (i.e. symmetric, positive-definite but with
no further constraints on its form) there are seven variances and 21
covariances to estimate.  Optimization is with respect to another
parameterization, but it still has dimension 28 in this case.  Roughly
speaking, the cost of such optimization problems grows very quickly
(often exponentially) with dimension, and it does not surprise me that
this would take a very long time.  You must ask yourself whether the
ways in which these growth curves differ really have that great a
dimensionality.  In most cases I think it is a more effective modeling
strategy to start with a few random effects and check residuals to see
if the model needs to be made more complex, instead of starting with
an overly complex model.
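
The count above generalizes: q correlated random effects give q(q+1)/2 distinct (co)variance parameters. A quick check in R (the helper `npar` is illustrative, not part of nlme):

```r
## Free parameters in a general symmetric, positive-definite
## q x q variance-covariance matrix
npar <- function(q) q * (q + 1) / 2
npar(7)  # 28: seven variances plus 21 covariances
npar(2)  # 3: a random intercept-and-slope model stays cheap
```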

As Peter suggested, if you feel that lme is inadequate for your
purposes we invite you to write better software.



#
Douglas Bates <bates at stat.wisc.edu> writes:
I discussed that with Jim Ramsay, and he strongly discouraged me
from "guessing a reasonable, low-dimensional basis", using either
known models for growth curves or something like PCA.

His argument was that, in the first case, I put too many a priori
assumptions into the model, and in the other, that I use results
from the data to analyze the data, which would be some kind of
statistical "deadly sin".

He suggested, and that seemed plausible to me, to indeed use a
much too complex/flexible model, like a fairly high-dimensional
spline basis, and use lme to constrain that flexibility.

It seems to turn out that lme is not really made for this ...
... which does not mean I'd dare to say that I could write "better"
software. I had a look at the code when I was trying to write
a band-structured pdMat, and I'm really impressed ...

        Christof