same old question - lme4 and p-values - R-SIG-mixed-models

Tue, Apr 15, 2008 5:53 AM

Thanks for this pointer Ben.  Too bad the wiki is still down. :-(

I was able to retrieve a cached page from a Google search.
I think (hope) this will do the trick.

One more question.  Is there an "official" citation for this
information that would be appropriate as a reference in the manuscript?

Ben Bolker wrote:

  Also note that in the long thread on the R wiki
(wiki.r-project.org, search for "bates mixed" or some such --
I can't get through to it right now) DB suggests a
test for a composite hypothesis a_1=a_2=...=a_n=0
along with R code to do it ...

Andrew Robinson wrote:

On Sat, Apr 12, 2008 at 02:02:09PM +0200, Reinhold Kliegl wrote:

On Fri, Apr 11, 2008 at 3:10 PM, Kevin E. Thorpe
<kevin.thorpe at utoronto.ca> wrote:

This has been a very interesting thread.  However, I'm still
 wrestling with what to do for a fixed-effect that has more than
 one degree of freedom.

 In the data I'm analyzing, I have three groups to compare.

 So, I can get CIs for the two parameters, but that is a bit
 problematic for assessing an overall difference.

 Is it valid to do the following?  Estimate the parameters using both
 ML and REML.  If the estimates show good agreement, is that sufficient
 evidence to conclude the ML procedure is converging and that I can
 use a likelihood ratio test for the fixed effect?
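
A minimal sketch of the ML-versus-REML comparison described above. It is
only an illustration: the cake data shipped with lme4 stands in for the
three-group data (recipe has three levels), and the model formula is an
assumption, not the model from the question.

```r
library(lme4)

# Stand-in data: 'cake' ships with lme4; 'recipe' is a three-level factor
# playing the role of the three groups. The formula is illustrative only.
fm_ml   <- lmer(angle ~ recipe + (1 | recipe:replicate), data = cake, REML = FALSE)
fm_reml <- lmer(angle ~ recipe + (1 | recipe:replicate), data = cake, REML = TRUE)

# Fixed-effect estimates side by side
print(cbind(ML = fixef(fm_ml), REML = fixef(fm_reml)))

# Variance components from each fit
print(VarCorr(fm_ml))
print(VarCorr(fm_reml))
```

In a balanced design such as this one the fixed-effect estimates from the
two criteria will agree closely almost by construction, so the agreement
is a check on the fit rather than on the reference distribution of a test.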

I assume you refer to using anova(fm1, fm2) with fm1 fitting the model
without the fixed effect. This is a comparison of nested models that
differ in their fixed effects, so a likelihood ratio test is defined for
ML fits only. Note, however, that Pinheiro & Bates (2000, p. 87,
Section 2.4.2) "do not recommend using
such tests"; "not" is set in bold face. They show that such tests tend
to be anti-conservative, especially if the number of parameters
removed is large relative to the number of observations. Assuming you
have a decent number of total observations, you may be fine.
Alternatively, you may want to run a simulation for your situation;
you will also find R-code examples in the P&B section.
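
As a rough sketch of both suggestions, the following fits the two nested
models by ML, runs the likelihood ratio test, and then checks it with a
small parametric bootstrap under the smaller model. This is not the code
from the P&B section (which uses nlme); it is an lme4-based illustration.
The cake data, the model formulas, the seed, and nsim = 200 are all
assumptions chosen to keep the example short.

```r
library(lme4)

# Nested ML fits; 'recipe' (three levels, 2 df) is the fixed effect under test.
# fm1 omits it, fm2 includes it. Data and formulas are illustrative stand-ins.
fm1 <- lmer(angle ~ temp + (1 | recipe:replicate), data = cake, REML = FALSE)
fm2 <- lmer(angle ~ recipe + temp + (1 | recipe:replicate), data = cake, REML = FALSE)

# Likelihood ratio test against the chi-square reference distribution
print(anova(fm1, fm2))

# Observed likelihood ratio statistic
lrt_obs <- as.numeric(2 * (logLik(fm2) - logLik(fm1)))

# Parametric bootstrap: simulate responses from the null model fm1,
# refit both models to each simulated response, and collect the statistic.
set.seed(1)
nsim <- 200
ysim <- simulate(fm1, nsim = nsim)
lrt_sim <- vapply(ysim, function(y) {
  as.numeric(2 * (logLik(refit(fm2, y)) - logLik(refit(fm1, y))))
}, numeric(1))

# Simulation-based p-value (with the usual +1 correction)
p_boot <- (1 + sum(lrt_sim >= lrt_obs)) / (nsim + 1)
print(p_boot)
```

Comparing p_boot with the chi-square p-value from anova() gives some idea
of how anti-conservative the chi-square reference is for a given design;
a larger nsim would be needed for anything beyond a quick look.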

I agree with Reinhold's position here.  I also note in passing that
Doug uses this strategy to test the fixed effects in the cake data
(see ?cake).  Doug, does the cake data analysis represent a softening
of your position or a place-filler?
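
For readers without the help page at hand, here is a sketch in the spirit
of the ?cake examples: a sequence of nested ML fits compared with anova().
The exact code on the help page may differ between lme4 versions, and the
model names used here are made up for this illustration.

```r
library(lme4)

# Nested ML fits on the cake data, from the most to the least restrictive
# fixed-effects structure. Object names are hypothetical.
cake_lin <- lmer(angle ~ recipe + temp + (1 | recipe:replicate),
                 data = cake, REML = FALSE)   # linear trend in temperature
cake_add <- lmer(angle ~ recipe + temperature + (1 | recipe:replicate),
                 data = cake, REML = FALSE)   # temperature as a factor
cake_int <- lmer(angle ~ recipe * temperature + (1 | recipe:replicate),
                 data = cake, REML = FALSE)   # plus recipe-by-temperature interaction

# Sequential likelihood ratio tests
tab <- anova(cake_lin, cake_add, cake_int)
print(tab)
```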

My first reaction to your email was: Why is he only interested in the
overall effect of a fixed factor and not in specific comparisons
between its levels? After Andrew's comment to an earlier post, I
understand that there are such situations where you just want to
control for an aspect of the design that is not in the focus of your
theoretical concerns (e.g., in ecology you may have three sites that
could be characterized as levels of a fixed factor or as a sample from
a random factor). Perhaps your fixed factor would also be better
conceptualized as a random factor. In a way, you just want to control
for the variance contributed by this factor. If this applies to your
data, then you may be better off specifying your fixed factor as a
random factor. Then, your anova(fm1, fm2) compares nested models that
differ only in the random-effects part. In this case the likelihood
ratio test can be used with models fit by REML. These tests tend to be
conservative (Pinheiro & Bates, 2000, Section 2.4.1; following up on
Stram & Lee, 1994). So if the statistic reported by anova() is
significant, you are on the safe side; if not, you do not know. Also
keep in mind that random effects with few units may generate problems
for model convergence.
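
A sketch of this random-factor variant, again with the cake data as a
stand-in (recipe, with three levels, is treated as random). The formulas
are assumptions. The halved p-value assumes the 50:50 mixture of
chi-square distributions with 0 and 1 df as the reference for testing a
single variance component on the boundary; whether that mixture is
adequate for a particular design is itself an assumption.

```r
library(lme4)

# fm1: without the factor; fm2: the three-level factor 'recipe' as a random
# intercept. Both are REML fits (the lmer default) with identical fixed effects.
fm1 <- lmer(angle ~ temp + (1 | recipe:replicate), data = cake)
fm2 <- lmer(angle ~ temp + (1 | recipe) + (1 | recipe:replicate), data = cake)

# refit = FALSE keeps the REML fits instead of refitting with ML
print(anova(fm1, fm2, refit = FALSE))

# Likelihood ratio statistic from the REML criteria (floored at zero)
lrt <- max(0, as.numeric(2 * (logLik(fm2) - logLik(fm1))))

# Naive chi-square(1) p-value, which tends to be conservative here,
# and the p-value under the assumed 50:50 mixture of chi-square(0) and chi-square(1)
p_naive <- pchisq(lrt, df = 1, lower.tail = FALSE)
p_mix   <- if (lrt > 0) 0.5 * p_naive else 1
print(c(LRT = lrt, p_naive = p_naive, p_mixture = p_mix))

# With only three levels the variance may well be estimated at zero
print(isSingular(fm2))
```

With so few levels, lmer may report a boundary (singular) fit for fm2,
which is one form the convergence problems mentioned above can take.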

That's an interesting idea, even if the interpretation is intended to
be a fixed factor.  It might work to a certain order of approximation,
but I'm not clear how the math would play out.  Some simulations might
provide a measure of comfort in individual situations.

Best wishes,

Andrew

Kevin E. Thorpe
Biostatistician/Trialist, Knowledge Translation Program
Assistant Professor, Department of Public Health Sciences
Faculty of Medicine, University of Toronto
email: kevin.thorpe at utoronto.ca  Tel: 416.864.5776  Fax: 416.864.6057