interpreting significance from lmer results for dummies (like me)

4 messages · Mark Kimpel, Andrew Robinson, Nick Isaac

Hi Mark,
On Fri, Apr 25, 2008 at 11:53:24PM -0400, Mark Kimpel wrote:
I'll take a stab.

1) the traditional, Fisher-style test of a null hypothesis is based on
   computing the probability of observing a test statistic as extreme
   or more extreme than the one actually observed, assuming that the
   null hypothesis is true.  This probability is called the p-value.
   If the p-value is less than some cut-off, e.g. 0.01, then the null
   hypothesis is rejected.

2) in order to compute that p-value, we need to know the cumulative
   distribution function of the test statistic when the null
   hypothesis is true.  In simple cases this is easy: for example, we
   use the t-distribution for the comparison of two normal means (with
   assumed equal variances, etc.).
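To make points 1 and 2 concrete, the whole recipe fits in a few lines of base R (the data below are invented purely for illustration):

```r
## Two-sample t test "by hand", assuming equal variances.
x <- c(5.1, 4.8, 5.6, 5.0, 5.3)
y <- c(4.2, 4.6, 4.4, 4.9, 4.1)
n1 <- length(x); n2 <- length(y)

## Pooled variance and the test statistic
sp2   <- ((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2)
tstat <- (mean(x) - mean(y)) / sqrt(sp2 * (1 / n1 + 1 / n2))

## Here the reference distribution is known: t with n1 + n2 - 2 df.
df <- n1 + n2 - 2
p  <- 2 * pt(-abs(tstat), df)   # two-sided p-value from the t CDF
```

This reproduces `t.test(x, y, var.equal = TRUE)$p.value` exactly; the point is that step 2 (knowing the reference distribution) is what makes the p-value computable at all.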

3) in (many) hierarchical models the cumulative distribution function
   of the test statistic when the null hypothesis is true is simply not
   known.  So, we can't compute the p-value.  

3a) in a limited range of hierarchical models that have historically
    dominated analysis of variance, e.g. split-plot designs, the
    reference distribution is known (it's F).  
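For the classical designs in 3a, base R already gives those F tests via aov() with an Error() term; a minimal sketch with simulated (made-up) data:

```r
## Classical split-plot ANOVA: whole plots get 'irrigation',
## subplots within each whole plot get 'variety'.
## The data are simulated noise, purely to show the machinery.
set.seed(1)
dat <- expand.grid(block      = factor(1:4),
                   irrigation = factor(1:2),
                   variety    = factor(1:3))
dat$yield <- rnorm(nrow(dat))

fit <- aov(yield ~ irrigation * variety + Error(block / irrigation),
           data = dat)
summary(fit)   # F tests against the appropriate error strata
```

Here the error strata are known from the design, so the reference distribution (F, with known df) is not in question.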

3b) Numerous experts have (quite reasonably) built up a bulwark of
    intuitive knowledge about the analysis of such designs.

3c) the intuition does not necessarily pertain to the analysis of any
    arbitrary hierarchical design, which might be unbalanced, and have
    crossed random effects.  That is, the intuition might be applied,
    but inappropriately.

4) in any case, the distribution that is intuitively or otherwise
    assumed is the F, because it works in the cases mentioned in 3a.
    All that remains is to define the degrees of freedom.  The
    numerator degrees of freedom are obvious, but the denominator
    degrees of freedom are not known.

4a) numerous other packages supply approximations to the denominator
    degrees of freedom, e.g. Satterthwaite and Kenward-Roger (KR, which
    is related).  They have been subjected to a modest degree of
    scrutiny by simulation.
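For what it's worth, such approximations can be had in R through add-on packages; the sketch below assumes the lme4 and pbkrtest packages (and lme4's sleepstudy data), none of which are base R:

```r
## Hypothetical sketch: Kenward-Roger F test for a fixed effect,
## assuming the lme4 and pbkrtest add-on packages are installed.
library(lme4)
library(pbkrtest)

big   <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy, REML = TRUE)
small <- lmer(Reaction ~ 1    + (Days | Subject), sleepstudy, REML = TRUE)

## F test with Kenward-Roger-adjusted denominator df
KRmodcomp(big, small)
```

Note that this only patches the denominator df; as point 5 says, it does not settle whether F is the right reference distribution to begin with.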

5) however, it is not clear that the reference distribution is really
   F at all, and therefore it is not clear that correcting the
   denominator degrees of freedom is what is needed.  Confusion reigns
   on how the p-values should be computed.  And because of this
   confusion, Doug Bates declines to provide p-values.
There are numerous approximations, but no way to properly judge
significance as far as I am aware.  Try the R-wiki for algorithms, and
be conservative.  
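One way to "be conservative" in practice is to check that a fixed-effect t statistic would stay significant across the whole range of denominator df you might plausibly assume (the t value below is made up; in real use it would come from summary(lmer(...))):

```r
## Hypothetical t statistic for a fixed effect
tstat <- 2.4

## Two-sided p-values under a range of plausible denominator df;
## the smallest df gives the most conservative (largest) p-value.
p_range <- 2 * pt(-abs(tstat), df = c(5, 10, 30, Inf))
p_range
```

If even the worst case (smallest df) is below your cut-off, the df controversy is moot for that effect.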

http://wiki.r-project.org/rwiki/doku.php

Or, use lme, report the p-values computed therein, and be aware that
they are not necessarily telling you exactly what you want to know.
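A sketch of that route, using nlme's lme() with the Orthodont data that ships with the package (nlme is a recommended package bundled with R):

```r
## lme() reports p-values, computed from a classical
## inner-outer df heuristic rather than a general theory.
library(nlme)

fit <- lme(distance ~ age + Sex, random = ~ 1 | Subject,
           data = Orthodont)
summary(fit)$tTable   # columns include denominator DF and p-value
```

The denominator DF in that table come from nlme's containment-style counting rule, which is exactly the kind of approximation the caveat above refers to.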
Not sure about that one.  I'm working on some simulations with Doug
but it's slow going, mainly because I'm chronically disorganised.
The conflict is not about p-values per se, but about the way that they
are calculated.  I would bet that the joint goal is to find an
algorithm that provides robust, reasonable inference in a sufficiently
wide variety of cases that its implementation proves to be worthwhile.

I hope that this was helpful.

Andrew