Confidence interval for relative contribution of random effect variance - R-SIG-mixed-models

Thu, Sep 11, 2014 2:16 AM

Dear all,

Earlier this year, I started to look into how I could estimate the contribution of the between-subject variability relative to several levels of (summed) within-subject variability in the context of mixed models. Because I have been using nlme and lme4 for quite a while, that is where I started to look. In the end, I would like to have not only a point estimate but also a measure of precision, that is, either a confidence or a credible interval.

When I started to look into this, I came across a two-year-old suggestion by Ben Bolker that relied on mcmcsamp (http://stats.stackexchange.com/questions/30797/posterior-simulations-of-the-variances-with-the-mcmcsamp-function), which I liked because the Markov chain would allow me to calculate the measure of interest at each simulation step and accordingly calculate, e.g., an HPD interval of the ratio of between- versus (summed) within-subject variability.

Now, doing some more research in the R archives, help files and vignettes, I realize that I have been off the sig-mixed-models list for too long (due to workload and yes, I will try to be better in future ;-) and that mcmcsamp is no longer supported/developed. On the other hand, the function confint() now exists. Many thanks to the developers!

A side note: Using the confint function on one of my models and comparing the confidence intervals with the point estimates from the summary of the same model, it seems that confint reports confidence intervals for the estimated standard deviations of the random effects as well as of the error variability, whereas summary reports the standard deviations for the random effects but the variance for the residuals. Is this correct? I seem to remember some such discussion but could not find any note online that would have verified this. Page 31 in "Fitting linear mixed-effects models using lme4" discusses this part of the summary output but seems to use the terms standard deviation and variance somewhat interchangeably (or, more likely, I failed to read it correctly).

Now, apart from this aspect, can confint be tweaked to calculate confidence intervals not only for the 'raw' parameters but also for some function of the parameters? If not, do I need to move to an implementation using MCMC methods (MCMCglmm, BUGS-type approaches, Stan or LaplacesDemon) to reach my aim, or do you have another (simpler) suggestion?

Many thanks and regards, Lorenz
-
Lorenz Gygax, PD Dr. sc. nat., Scientist
Federal Food Safety and Veterinary Office FFSVO
Centre for Proper Housing of Ruminants and Pigs
Tänikon, CH-8356 Ettenhausen, Switzerland

Ben Bolker

Thu, Sep 11, 2014 3:24 PM

<lorenz.gygax at ...> writes:

A side-line: Using the confint function on one of my models and
comparing the confidence intervals with the point-estimates from the
summary of the same model, it seems that confint reports confidence
intervals for the estimated standard deviations of the random
effects as well as of the error-variability whereas summary reports
the standard deviations for the random effects but the variance for
the residuals. Is this correct? I seem to remember some such
discussion but could not find any note online that would have
verified this fact. Page 31 in "Fitting linear mixed-effects models
using lme4" discusses this part of the summary output but seems to
be using the terms standard deviation and variance somewhat
interchangeably (or, more likely, I failed to read it correctly).

Hmmm.  The output of 

library(lme4)
fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
summary(fm1)

gives


Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Subject  (Intercept) 612.09   24.740       
          Days         35.07    5.922   0.07
 Residual             654.94   25.592       
Number of obs: 180, groups:  Subject, 18

which shows both the variance and the standard deviation for every
term, including the residual (i.e., *not* an uncertainty estimate,
just the point estimate of the variability on both the variance and
the standard-deviation scales).
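
To compare the two outputs directly, something along these lines
should work. It is only a sketch: it assumes a reasonably recent lme4
in which confint() accepts parm = "theta_", oldNames and quiet, and
the object name ci_sd is just for illustration.

```r
library(lme4)

fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

# Point estimates: 'vcov' is on the variance/covariance scale,
# 'sdcor' on the standard-deviation/correlation scale
as.data.frame(VarCorr(fm1))

# Profile intervals for the random-effect parameters only; these are
# reported as standard deviations and correlations, residual included
ci_sd <- confint(fm1, parm = "theta_", oldNames = FALSE, quiet = TRUE)
ci_sd

# Squaring the end points of the SD rows gives intervals on the
# variance scale (the correlation row is left out)
ci_sd[grep("^sd_|^sigma$", rownames(ci_sd)), ]^2
```

Note that profiling refits the model by maximum likelihood, so the
intervals are not centred on the REML estimates shown by summary().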

You can compute parametric bootstrap confidence intervals of
any quantity you want by applying boot.ci() to the results of bootMer()
(bootMer()'s second argument is the summary function, which you
can define however you like).  This is computationally expensive,
though (even more expensive than MCMC-type computations).
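
A minimal sketch of that approach, assuming a simple random-intercept
model on the sleepstudy data (not your actual model); var_share() is a
made-up helper name, and nsim is kept small here only to keep the run
time short:

```r
library(lme4)
library(boot)

# Random-intercept model on the sleepstudy data shipped with lme4
fm <- lmer(Reaction ~ Days + (1 | Subject), sleepstudy)

# Summary function (hypothetical name): share of the total variance
# that is due to between-subject differences
var_share <- function(fit) {
  vc <- as.data.frame(VarCorr(fit))
  between <- vc$vcov[vc$grp == "Subject"]
  within  <- vc$vcov[vc$grp == "Residual"]
  c(share = between / (between + within))
}

set.seed(101)
# Parametric bootstrap: simulate from the fitted model, refit, and
# recompute var_share() each time
bb <- bootMer(fm, FUN = var_share, nsim = 200)

# Percentile interval for the variance share
ci <- boot.ci(bb, type = "perc")
print(ci)
```

With several within-subject levels, the same idea applies: sum the
relevant rows of the VarCorr data frame inside the summary function.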

  In principle you might be able to use likelihood profiling
(which is what the default confint() method uses) to compute
profile likelihood confidence intervals of arbitrary quantities,
but you would need to be able to constrain an optimization algorithm
to the specified values, i.e., you would need to set nonlinear
equality constraints. There are functions in nloptr and elsewhere
(many of them called auglag()) that implement an augmented Lagrange
multiplier algorithm for such constraints, but I haven't tried it
out to see how it works.
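
There is one special case where no constrained optimizer is needed,
sketched below under stated assumptions: with a single random
intercept, the variance share depends only on lme4's relative
standard deviation theta = sd(Subject) / sd(Residual), and the
deviance function returned by devFunOnly = TRUE has already profiled
out everything else. This is an illustration for the sleepstudy
data, not a general solution; excess() and share() are made-up
helper names.

```r
library(lme4)

# ML fit (likelihood-ratio intervals are based on the ML deviance)
fm_ml <- lmer(Reaction ~ Days + (1 | Subject), sleepstudy, REML = FALSE)

# Deviance as a function of theta; fixed effects and the residual SD
# are profiled out internally
devfun <- lmer(Reaction ~ Days + (1 | Subject), sleepstudy,
               REML = FALSE, devFunOnly = TRUE)

theta_hat <- unname(getME(fm_ml, "theta"))
dev_min   <- devfun(theta_hat)
cutoff    <- qchisq(0.95, df = 1)

# Distance of the profile deviance from the 95% cutoff
excess <- function(theta) devfun(theta) - dev_min - cutoff

# The search ranges are assumptions that happen to bracket the roots
# for this data set
lower <- uniroot(excess, c(0, theta_hat))$root
upper <- uniroot(excess, c(theta_hat, 10 * theta_hat))$root

# Variance share is a monotone function of theta, so the end points
# can simply be transformed
share <- function(theta) theta^2 / (1 + theta^2)
res <- c(lower = share(lower), estimate = share(theta_hat),
         upper = share(upper))
res
```

As soon as the quantity of interest involves more than one theta
component (e.g. several random-effect terms), fixing its value no
longer fixes the parameters, and the equality-constrained
optimization described above is needed.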

The advantage of parametric bootstrap/MCMC approaches is that
you also get a finite-size-appropriate result; likelihood profiling
would inherit the asymptotic assumptions of the likelihood ratio test.

glmmADMB still implements a post-hoc MCMC sampling strategy similar
to mcmcsamp (but you would be on your own for making sure the
chain was well-behaved, etc.).

  Ben Bolker

lorenz.gygax at agroscope.admin.ch

Fri, Sep 12, 2014 4:05 AM

Dear Ben,

Many thanks for your input.

[... snip]

Ok. The latter may not be such an issue. This sounds doable and I will be looking into it! (And I can report back on my success ...)

This sounds rather daunting and I fear that I am not up to this ...

OK, that would be another avenue.

Many thanks again! Regards, Lorenz