interpreting significance from lmer results for dummies (like me)
4 messages · Mark Kimpel, Andrew Robinson, Nick Isaac
Hi Mark,
On Fri, Apr 25, 2008 at 11:53:24PM -0400, Mark Kimpel wrote:
I am a bioinformatician, with my strongest background in molecular biology. I have been trying to learn about mixed-effects models to improve the analysis of my experiments, which certainly contain random effects. I will admit to being totally lost in the discussions regarding the lack of p-value reporting in the current versions of lmer. Furthermore, I suspect that those who need to publish in non-statistical journals will face reviewers who are equally in the dark. Where can I find a biologist-level explanation of the current controversy,
I'll take a stab.
1) the traditional, Fisher-style test of a null hypothesis is based on
computing the probability of observing a test statistic as extreme
or more extreme than the one actually observed, assuming that the
null hypothesis is true. This probability is called the p-value.
If the p-value is less than some cut-off, e.g. 0.01, then the null
hypothesis is rejected.
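Point 1 can be made concrete with a few lines of R. This is a minimal sketch; the observed statistic and degrees of freedom are made-up illustration values, not from any real analysis.

```r
# Hedged sketch: computing a Fisher-style two-sided p-value from a
# test statistic. t_obs and df are hypothetical illustration values.
t_obs <- 2.5   # hypothetical observed t statistic
df    <- 18    # hypothetical degrees of freedom

# P(|T| >= |t_obs|) under the null: twice the upper tail of the
# reference t distribution.
p_value <- 2 * pt(abs(t_obs), df = df, lower.tail = FALSE)
p_value           # about 0.02
p_value < 0.01    # FALSE: not rejected at the 0.01 cut-off
```

Note that the whole calculation hinges on knowing which reference distribution `pt()` should stand in for, which is exactly what breaks down in point 3.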
2) in order to compute that p-value, we need to know the cumulative
distribution function of the test statistic when the null
hypothesis is true. In simple cases this is easy: for example, we
use the t-distribution for the comparison of two normal means (with
assumed equal variances etc).
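The "simple case" in point 2 can be verified directly in R: for two normal samples with assumed equal variances, the hand-computed pooled t statistic referred to a t distribution with n1 + n2 - 2 degrees of freedom reproduces `t.test()` exactly. The data below are simulated purely for illustration.

```r
# Hedged sketch: the simple case where the reference distribution is
# known exactly (t with n1 + n2 - 2 df). Simulated illustration data.
set.seed(1)
x <- rnorm(10, mean = 0)
y <- rnorm(12, mean = 1)

# Pooled-variance t statistic computed by hand
n1 <- length(x); n2 <- length(y)
sp2    <- ((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2)
t_stat <- (mean(x) - mean(y)) / sqrt(sp2 * (1 / n1 + 1 / n2))
p_hand <- 2 * pt(abs(t_stat), df = n1 + n2 - 2, lower.tail = FALSE)

# Agrees with the built-in equal-variance test
fit <- t.test(x, y, var.equal = TRUE)
all.equal(p_hand, fit$p.value)   # TRUE
```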
3) in (many) hierarchical models the cumulative distribution function
of the test statistic when the null hypothesis is true is simply not
known. So, we can't compute the p-value.
3a) in a limited range of hierarchical models that have historically
dominated analysis of variance, e.g. split-plot designs, the
reference distribution is known (it's F).
3b) Numerous experts have (quite reasonably) built up a bulwark of
intuitive knowledge about the analysis of such designs.
3c) the intuition does not necessarily pertain to the analysis of any
arbitrary hierarchical design, which might be unbalanced, and have
crossed random effects. That is, the intuition might be applied,
but inappropriately.
4) in any case, the distribution that is intuitively or otherwise
assumed is the F, because it works in the cases mentioned in 3a.
All that remains is to define the degrees of freedom. The
numerator degrees of freedom are obvious, but the denominator
degrees of freedom are not known.
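Why the unknown denominator degrees of freedom matter can be shown numerically: the same F statistic yields quite different p-values depending on the denominator df you assume. The statistic and df values below are made up for illustration.

```r
# Hedged illustration: one hypothetical F statistic (numerator df = 2)
# evaluated under several assumed denominator df. The p-value shrinks
# as the assumed denominator df grows, and can cross a cut-off.
F_obs <- 4.0
sapply(c(4, 10, 30, 100), function(ddf)
  pf(F_obs, df1 = 2, df2 = ddf, lower.tail = FALSE))
```

With only 4 denominator df the result is not significant at the 0.05 level, while with 100 df it is, so getting the denominator df wrong can flip the conclusion.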
4a) numerous other packages supply approximations to the denominator
degrees of freedom, e.g. Satterthwaite and Kenward-Roger (which is
related). They have been subjected to a modest degree of scrutiny
by simulation.
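The Satterthwaite idea can be seen in its most familiar setting, Welch's two-sample t-test, where the denominator df is approximated from the two sample variances; this is only an analogy to what the mixed-model packages do, sketched here with simulated data.

```r
# Hedged sketch of the Satterthwaite approximation in its simplest
# setting: Welch's t-test. Simulated illustration data.
set.seed(42)
x <- rnorm(8,  sd = 1)
y <- rnorm(15, sd = 3)

v1 <- var(x) / length(x)
v2 <- var(y) / length(y)
# Satterthwaite approximation to the denominator degrees of freedom
df_sat <- (v1 + v2)^2 /
  (v1^2 / (length(x) - 1) + v2^2 / (length(y) - 1))

# Matches the df that t.test() reports by default (Welch)
fit <- t.test(x, y)
all.equal(df_sat, unname(fit$parameter))   # TRUE
```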
5) however, it is not clear that the reference distribution is really
F at all, and therefore it is not clear that correcting the
denominator degrees of freedom is what is needed. Confusion reigns
on how the p-values should be computed. And because of this
confusion, Doug Bates declines to provide p-values.
how can I learn how to properly judge significance from my lmer results,
There are numerous approximations, but no way to properly judge significance, as far as I am aware. Try the R wiki for algorithms, and be conservative: http://wiki.r-project.org/rwiki/doku.php Or, use lme, report the p-values computed therein, and be aware that they are not necessarily telling you exactly what you want to know.
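One way to read "be conservative" is that, when the denominator df is uncertain, computing the p-value under the smallest plausible df gives the largest (least anti-conservative) p-value. A minimal sketch, with a hypothetical test statistic:

```r
# Hedged sketch of "be conservative": for a fixed t statistic, fewer
# assumed df give heavier tails and hence a larger p-value.
t_obs <- 2.2   # hypothetical observed statistic
p_for_df <- function(df) 2 * pt(abs(t_obs), df = df, lower.tail = FALSE)

p_for_df(5)                  # conservative choice: small df
p_for_df(50)                 # optimistic choice: large df
p_for_df(5) > p_for_df(50)   # TRUE
```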
and what peer-reviewed references can I steer reviewers towards?
Not sure about that one. I'm working on some simulations with Doug but it's slow going, mainly because I'm chronically disorganised.
I understand, from other threads, that some believe a paradigm shift away from p-values may be necessary, but it is not clear to me what paradigm will replace this entrenched view. I can appreciate the fact that there may be conflicting opinions about the best equations/algorithms for determining significance, but is there any agreement on the goal we are heading towards?
The conflict is not about p-values per se, but about the way that they are calculated. I would bet that the joint goal is to find an algorithm that provides robust, reasonable inference in a sufficiently wide variety of cases that its implementation proves to be worthwhile.

I hope that this was helpful.

Andrew
Andrew Robinson
Department of Mathematics and Statistics    Tel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
http://www.ms.unimelb.edu.au/~andrewpr
http://blogs.mbs.edu/fishing-in-the-bay/