
Teaching Mixed Effects

The accumulation of quoted posts in this thread is quite long, so I have trimmed out all but some paragraphs from Doug Bates that prompt my comments.
[ ... ]
But a goal of this exploration of the model/data is still to come up with an approximation to a sampling distribution (of a test statistic or parameter estimate), right?  Ultimately, once all the exploration is done, the scientific researcher still wants to be able to tell his/her colleagues "Based on these data my conclusion about the effect of treatment X is ______ and my confidence in this conclusion is _____."  That is, the researcher wants to make *inferences* from the data.

This paragraph brings to mind some comments about p-values and hypothesis tests that seem popular on this list.  Among users of lme4 a theme seems to be that "this ignorant editor/referee is insisting that I demonstrate that my discovery of the effect of treatment X is not a false positive.  How can I get around this?"

Every field of science needs to protect its literature from being overwhelmed by false "discoveries".  Since all the professional and economic rewards go to those who make discoveries, there is enormous incentive to claim that one has found an effect or association, and so there needs to be a reality check.  The conventional way of judging these claims is via p-values and hypothesis tests.  Granting all of their faults and limitations, what do people propose as a better way?

Some will probably say we should focus on (interval) estimation of parameters rather than testing.  But this doesn't solve the problem at hand.  Remember how this discussion started: the difficulty of calculating reliable p-values for fixed effects.  Because of the duality between hypothesis tests and confidence intervals, if you can't get reliable p-values, then essentially by definition you can't get reliable confidence intervals either.  So the issue of testing-versus-estimation is a red herring with respect to the deeper problem of inference for fixed effects in mixed models.
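To make the duality concrete, here is a toy sketch in Python rather than R (and unrelated to lme4's internals): for a Wald z-test of a single parameter, the 95% interval excludes zero exactly when the two-sided p-value falls below 0.05.  The estimate and standard error are made-up numbers, purely for illustration.

```python
import math

def pnorm(z):
    """Standard normal CDF (the analogue of R's pnorm)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def wald_p_value(estimate, se, null=0.0):
    """Two-sided p-value for a Wald z-test of H0: theta = null."""
    t = (estimate - null) / se
    return 2.0 * (1.0 - pnorm(abs(t)))

def wald_ci_95(estimate, se):
    """95% confidence interval obtained by inverting the same test."""
    z = 1.959963984540054  # qnorm(0.975), hard-coded to stay dependency-free
    return (estimate - z * se, estimate + z * se)

# Made-up estimate and standard error.
est, se = 1.2, 0.5
p = wald_p_value(est, se)
lo, hi = wald_ci_95(est, se)
# Duality: the interval excludes the null value exactly when p < 0.05.
excludes_zero = lo > 0.0 or hi < 0.0
print(round(p, 4), excludes_zero, p < 0.05)
```

The point being that an unreliable p-value and an unreliable interval are two faces of the same approximation: they come from inverting the same test.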

Others may try to dodge the problem by claiming to be Bayesians, calculating credible intervals from posterior distributions.  But this won't hold water if they are using lme4 and mcmcsamp.  No honest Bayesian can use an "uninformative" prior (if such a thing even exists), or at least not more than once in any area of research: after analysis of one data set, the posterior from that analysis should inform the prior for the next, but lme4 has its priors hard-coded.  I think the real rationale for mcmcsamp is the hope that it will produce results with good frequentist properties.  I am not aware that this has been demonstrated for mixed models in the peer-reviewed statistics literature.
Doug, can you elaborate on that last clause?  In what way is the (absolute) ratio |T| informative that the monotonic transformation 2*(1 - pnorm(abs(T))) is not?  In other words, if a p-value (based in this case on a standard normal) is not reliable for inference, what inferential value does T have?  Less formally, if T = 2, for example, what exactly do you conclude about the parameter, and what is your confidence in that conclusion?
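The monotonicity claim is easy to verify numerically: 2*(1 - pnorm(abs(T))) is strictly decreasing in |T|, so T and the p-value rank the evidence against the null identically.  A small Python check, standing in for the R expression:

```python
import math

def pnorm(z):
    """Standard normal CDF, as in R's pnorm."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def p_from_t(t):
    """The transformation from the question: 2 * (1 - pnorm(abs(t)))."""
    return 2.0 * (1.0 - pnorm(abs(t)))

# As |T| grows, p shrinks: the map is strictly monotone, so ordering
# results by |T| or by p gives exactly the same ranking of evidence.
ts = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
ps = [p_from_t(t) for t in ts]
assert all(a > b for a, b in zip(ps, ps[1:]))
print([round(p, 4) for p in ps])
```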
 
   [ ... ]
Yes, but the problem is that (as you noted earlier in your post) the null distribution depends on the values of the nuisance parameters, except in very special cases.  A simple parametric bootstrap conditions on the estimated values of those parameters, as if they were known.  Nevertheless, a function to do this might be a useful addition to lme4.
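As a sketch of what that conditioning means, here is a parametric bootstrap for the simplest possible case, written in Python rather than R and with invented data: the model is a plain normal mean, the standard deviation plays the role of the nuisance parameter, and the null distribution of a t-like statistic is simulated with the estimated sd plugged in as if it were known.

```python
import math
import random

random.seed(1)

# Invented data, purely for illustration.
data = [0.3, 1.1, -0.4, 0.8, 1.5, 0.2, 0.9, -0.1, 1.2, 0.6]
n = len(data)

def t_stat(sample):
    """t-like statistic for H0: mean = 0."""
    m = sum(sample) / len(sample)
    s = math.sqrt(sum((x - m) ** 2 for x in sample) / (len(sample) - 1))
    return m / (s / math.sqrt(len(sample)))

mean_hat = sum(data) / n
sd_hat = math.sqrt(sum((x - mean_hat) ** 2 for x in data) / (n - 1))
t_obs = t_stat(data)

# Parametric bootstrap of the null (mean = 0) distribution of t,
# conditioning on sd_hat as if it were the known nuisance parameter.
B = 2000
t_null = [t_stat([random.gauss(0.0, sd_hat) for _ in range(n)])
          for _ in range(B)]
p_boot = sum(abs(t) >= abs(t_obs) for t in t_null) / B
print(round(t_obs, 3), round(p_boot, 3))
```

The weakness noted above is visible in the design: the uncertainty in sd_hat itself never enters the simulation, which is exactly what "conditioning on the estimated values" means.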

Rich Raubertas
Merck & Co.