same old question - lme4 and p-values
On Sun, Apr 6, 2008 at 9:05 PM, David Henderson
<dnadave at revolution-computing.com> wrote:
Hi John:
> For all practical purposes, a CI is just the Bayesian credible
> interval that one gets with some suitable "non-informative prior".
> Why not then be specific about the prior, and go with the Bayesian
> credible interval? (There is an issue whether such a prior can
> always be found. Am I right in judging this of no practical consequence?)
What? Could you explain this a little more? There is nothing Bayesian about a classical CI (i.e. not a Bayesian credible set or highest posterior density interval, or whatever terminology you prefer). The interpretation is completely different, and the assumptions used in deriving the interval are also different. Even though the interval obtained with a noninformative prior can be numerically similar to a classical CI, they are not the same entity.
Now, while I agree with the arguments about p-values and their validity, there is one aspect missing from this discussion. When creating a general-use package like lme4, we are trying to create software that enables statisticians and researchers to perform the statistical analyses they need and interpret the results in ways that HELP them get published. While I admire Doug for "drawing a line in the sand" in regard to the use of p-values in published research, this is counter to HELPING researchers publish their results. There has to be a better way to further your point in the community than FORCING your point upon them. Education of the next generation of researchers and journal editors is admittedly slow, but it is a much more community-friendly way of getting your point used in practice.
Perhaps I should clarify. The summary of a fitted lmer model does not provide p-values because I don't know how to calculate them in an acceptable way, not because I am philosophically opposed to them. The estimates and their approximate standard errors can readily be calculated, as can their ratio. The problem is determining the appropriate reference distribution for that ratio from which to calculate a p-value. In fixed-effects models (under the "usual" assumptions) that ratio has a t distribution with a known number of degrees of freedom. For mixed models it is not clear exactly what distribution it has, except in certain cases of completely balanced data sets (i.e. the sort of data sets that occur in textbooks).

At one time I used a t distribution with an upper bound on the degrees of freedom, but I was persuaded that providing p-values that could be strongly "anti-conservative" is worse than not providing any. That decision not to provide p-values is particularly inconvenient to many users who are not especially interested in statistical niceties but do need to satisfy editors or referees who want to see p-values. I know that is a real problem.

My earlier comment about having created a monster that now turns on us, which touched off this line of discussion, was more about the fact that we try to take complex analyses and reduce the conclusions from them to a single number, the p-value. We can provide considerable information about the models that are fit to the experimenter's data, but without p-values the experimenter may be unable to publish the results.

The approach that I feel is most likely to be successful in summarizing these models is first to obtain the REML or ML estimates of the parameters, then to run a Markov chain Monte Carlo sampler to assess the variability in the parameters (or, if you prefer, the variability in the parameter estimators).
(Note: I am not advocating using MCMC to obtain the estimates; I suggest MCMC for assessing the variability.) The current version of the mcmcsamp function suffers from the practical problem that it gets stuck at near-zero values of variance components. There are some approaches to dealing with that. Over the weekend I thought I had a devastatingly simple way of handling such cases, until I reflected on it a bit more and realized that it would require a division by zero. Other than that, it was a good idea.
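A minimal sketch of the workflow described above, using the sleepstudy data shipped with lme4. Note that the mcmcsamp interface shown here is the one that existed in lme4 at the time of this thread and has since been removed from the package, so treat this as illustrative rather than as current practice:

```r
## Fit by REML first; use MCMC only to assess variability, not to estimate.
library(lme4)

## Any lmer fit would do; sleepstudy is a data set bundled with lme4.
fm <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

## The summary reports estimates, standard errors, and their ratio
## (a t value) -- but, deliberately, no p-values.
summary(fm)

## Draw MCMC samples around the REML estimates and summarize the
## parameter variability with highest-posterior-density intervals.
## (mcmcsamp and the HPDinterval method for its result were part of
## lme4 at the time of writing; both are gone from modern lme4.)
samp <- mcmcsamp(fm, n = 1000)
HPDinterval(samp)
```

In current lme4 versions, confint(fm, method = "boot") or a fully Bayesian fit serve a similar purpose.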
Just my $0.02...

Dave H

--
David Henderson, Ph.D.
Director of Community
REvolution Computing
1100 Dexter Avenue North, Suite 250
206-577-4778 x3203
DNADave at Revolution-Computing.Com
http://www.revolution-computing.com
_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models