same old question - lme4 and p-values
On Mon, Apr 7, 2008 at 9:18 PM, David Henderson <dnadavenator at gmail.com> wrote:
Hi Doug:
Perhaps I should clarify. The summary of a fitted lmer model does not provide p-values because I don't know how to calculate them in an acceptable way, not because I am philosophically opposed to them. The estimates and the approximate standard errors can be readily calculated, as can their ratio. The problem is determining the appropriate reference distribution for that ratio from which to calculate a p-value. In fixed-effects models (under the "usual" assumptions) that ratio is distributed as a t with a certain number of degrees of freedom. For mixed models it is not clear exactly what distribution it has - except in certain cases of completely balanced data sets (i.e. the sort of data sets that occur in textbooks). At one time I used a t distribution and an upper bound on the degrees of freedom, but I was persuaded that providing p-values that could be strongly "anti-conservative" is worse than not providing any.
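[As an illustration of the anti-conservatism point, not from the original message: the same t ratio gives a much smaller p-value when referred to a t distribution with an inflated (upper-bound) number of degrees of freedom. A minimal sketch, assuming scipy is available; the ratio value and the two df values are made up for illustration.]

```python
from scipy import stats

# Hypothetical estimate/std.error ratio for a fixed effect.
t_ratio = 2.2

# Two-sided p-values under two candidate reference distributions.
# With few groups the "honest" df may be small (here 5); an upper
# bound based on the total number of observations could be much
# larger (here 29).
p_small_df = 2 * stats.t.sf(t_ratio, df=5)
p_large_df = 2 * stats.t.sf(t_ratio, df=29)

# Referring the ratio to the large-df distribution yields the smaller
# p-value - an "anti-conservative" answer whenever the true df is small.
print(p_small_df, p_large_df)
```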
Now I understand the situation better and am in agreement that this is clearly the right solution at this point.
The approach that I feel is most likely to be successful in summarizing these models is first to obtain the REML or ML estimates of the parameters then to run a Markov chain Monte Carlo sampler to assess the variability in the parameters (or, if you prefer, the variability in the parameter estimators). (Note: I am not advocating using MCMC to obtain the estimates, I suggest MCMC for assessing the variability.)
I'm a little confused: what is the Monte Carlo part of this scenario? If you perform REML or ML, it should theoretically always converge to the REML/ML estimates (unless you have a flat or multimodal likelihood, each of which produces other problems). I understand you are fixing the parameter estimates of something at the REML/ML estimates, but what is the random component?
Although the MCMC chain starts at the REML/ML estimates, its iterations are of the form:

- given the current residuals, sample a new value of $\sigma$ using a random value from a $\chi^2$ distribution
- given the current values of $\sigma$ and the variance-covariance of the random effects, sample new values of the fixed effects and the random effects (in the Bayesian formulation both the fixed-effects parameters and the random effects are regarded as random variables); for a locally uniform prior on the fixed effects this stage can be reduced to sampling from a multivariate normal distribution
- given the current values of $\sigma$, the fixed effects, the variance-covariance of the random effects and the random effects themselves, sample from the distribution of the variance-covariance parameters
- repeat the above three steps n times

It is the third step that gets tricky. The "simple" approach is to condition only on the values of the random effects and use a Wishart distribution. The problem with feedback is in steps 2 and 3: if in step 3 you happen to get a very small value of a variance component, then the next set of random effects sampled in step 2 will be small in magnitude, resulting in the next sample of the variance component in step 3 being very small, resulting in ...

I enclose a script to illustrate this effect using a model fit to data from an experiment described in the classic book "Statistical Methods in Research and Production" edited by O. L. Davies. The "Yield" variable is the amount of dyestuff in five different analyses of samples from each of six different batches. The REML estimates for a simple random-effects model reproduce the estimates of the variance components shown in the book (there were at least 4 editions of the book, the first in 1947, and all contain this example).
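[The three-step sampler above can be sketched for a balanced one-way random-effects model, y_ij = mu + b_i + e_ij, with the Dyestuff layout (6 batches, 5 analyses each). This is a hedged Python illustration with simulated data standing in for the Davies yields - it is not the mcmcsamp implementation. Step 3 uses the "simple" approach that conditions only on the random effects, which is exactly what produces the feedback problem discussed above.]

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated stand-in for the Dyestuff layout: 6 batches, 5 analyses each.
# The true parameter values are invented for illustration.
q, n = 6, 5
true_mu, true_sd_b, true_sd_e = 1500.0, 40.0, 50.0
b_true = rng.normal(0.0, true_sd_b, size=q)
y = true_mu + np.repeat(b_true, n) + rng.normal(0.0, true_sd_e, size=q * n)
batch = np.repeat(np.arange(q), n)

def gibbs(y, batch, q, n, n_iter=2000):
    N = y.size
    mu, b = y.mean(), np.zeros(q)
    s2, s2b = y.var(), y.var() / 2.0   # residual and batch variances
    draws = np.empty((n_iter, 3))      # columns: mu, sigma, sigma_batch
    for it in range(n_iter):
        # Step 1: residual variance given current residuals (chi-square).
        resid = y - mu - b[batch]
        s2 = (resid @ resid) / rng.chisquare(N)
        # Step 2: fixed effect and random effects given the variances.
        # (Multivariate normal; in the balanced case it factors into
        # independent scalar normals.)
        prec = n / s2 + 1.0 / s2b
        ybar = y.reshape(q, n).mean(axis=1)
        b = rng.normal((n / s2) * (ybar - mu) / prec, np.sqrt(1.0 / prec))
        mu = rng.normal((y - b[batch]).mean(), np.sqrt(s2 / N))
        # Step 3 ("simple" approach): batch variance conditioning only on b.
        # If b happens to be small, s2b collapses, shrinking the next b, ...
        s2b = (b @ b) / rng.chisquare(q)
        draws[it] = mu, np.sqrt(s2), np.sqrt(s2b)
    return draws

draws = gibbs(y, batch, q, n)
```

Starting the chain at the REML estimates, as Doug describes, would simply replace the crude initial values of mu, s2 and s2b here.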
If you run this script yourself and look at the plot of the samples from the Bayesian posterior distribution of the 3 parameters versus the iteration number for the MCMC sample, you will see the places where the value of ST1, which is the standard deviation for batches divided by the standard deviation for analyses within batch, gets stuck at zero. Those places also coincide with unusually large values of sigma and attenuated variability of the distribution of the mean parameter. (I don't enclose the plots because even the PDF files are very large when you are plotting 10000 samples)
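[A hypothetical diagnostic sketch, not from the original message: given a matrix of chain draws, the sticking behaviour described above can be quantified as the fraction of iterations where ST1 is essentially zero, and by comparing sigma on those iterations against the rest. The simulated draws below merely imitate a chain with the problem so the diagnostic has something to detect.]

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical draws imitating an mcmcsamp-style chain of 10000 samples:
# st1 = sd(batch)/sd(within), with some runs stuck at zero, and sigma
# inflated on exactly those runs (the pattern described in the post).
n_iter = 10_000
st1 = np.abs(rng.normal(0.6, 0.4, n_iter))
stuck = rng.random(n_iter) < 0.15
st1[stuck] = 0.0
sigma = np.where(stuck,
                 rng.normal(60.0, 5.0, n_iter),
                 rng.normal(50.0, 5.0, n_iter))

# Diagnostics: how often is ST1 at (numerical) zero, and is sigma
# systematically larger there?
at_zero = st1 < 1e-8
frac_stuck = at_zero.mean()
sigma_when_stuck = sigma[at_zero].mean()
sigma_otherwise = sigma[~at_zero].mean()
print(frac_stuck, sigma_when_stuck, sigma_otherwise)
```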
Of course, I could always stop being lazy and just look at the source... ;^)
The current version of the mcmcsamp function suffers from the practical problem that it gets stuck at near-zero values of variance components. There are some approaches to dealing with that. Over the weekend I thought that I had a devastatingly simple way of dealing with such cases until I reflected on it a bit more and realized that it would require a division by zero. Other than that, it was a good idea.
At least the variance estimates were not negative... ;^)

Thanks!!

Dave H

--
David Henderson, Ph.D.
Director of Community
REvolution Computing
1100 Dexter Avenue North, Suite 250
206-577-4778 x3203
DNADave at Revolution-Computing.Com
http://www.revolution-computing.com
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: Davies_R.txt
URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20080408/7b69f7a7/attachment.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: Davies_Rout.txt
URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20080408/7b69f7a7/attachment-0001.txt>