Interpretation of lmer output in R

On Mon, Feb 21, 2011 at 8:26 AM, Julia Sommerfeld
<Julia.Sommerfeld at utas.edu.au> wrote:
So I would conclude that even though sex was recorded it turns out
that it is not a significant predictor of the probability of site
fidelity and I would omit that term from the model.  As Ben described,
he would not be in favor of this approach because it verges on "data
snooping".  One can argue either way and, in this case it wouldn't
make much difference in the final conclusion whether or not the Sex
term is included.

So if I needed to quote a p-value for the BreedSuc factor this is what
I would quote.
The alternative is to fit a model of the form

fm2a <- lmer(SameSite ~ 1 + Sex + (1|Bird), family="binomial")

and compare it to the original model, fm, using

anova(fm2a, fm)

The general idea of testing whether BreedSuc makes a significant
contribution to predicting the probability of site fidelity is to fit
a model with the term and then fit the model without the term and
compare the quality of the fits.  To me the most sensible way to
compare the quality of the fits is to consider the likelihood ratio.
The model with the term will always fit at least as well as the one
without the term - the question is, "Is it significantly better?".  One
way to answer that question is to convert the likelihood ratio test
(LRT) statistic to a probability or p-value using the result that,
under the null hypothesis (that the term does not make a significant
contribution), the LRT statistic has approximately a chi-squared
distribution with 1 degree of freedom.  One can set up other criteria; for example AIC
penalizes each parameter as 2 units on the deviance scale (negative
twice the log-likelihood).  BIC is a bit more complicated in that the
number of units of penalty per parameter on the deviance scale depends
on the number of observations in the data set.
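
As a rough sketch of the three criteria just described, using only base R: the LRT value is the one reported further down in this message, while n_obs is a made-up number of observations chosen purely for illustration (the real sample size is not stated here).

```r
# LRT statistic reported later in this message for the BreedSuc comparison
lrt <- 4.0991
df_diff <- 1                 # one parameter dropped from the larger model

# Upper-tail chi-squared probability: the approximate LRT p-value
p_lrt <- pchisq(lrt, df = df_diff, lower.tail = FALSE)

# Penalty per parameter on the deviance scale
aic_penalty <- 2
n_obs <- 100                 # HYPOTHETICAL number of observations
bic_penalty <- log(n_obs)    # BIC penalty grows with the sample size

# A criterion prefers the larger model when the drop in deviance
# exceeds the penalty for the extra parameter(s)
prefers_larger_aic <- lrt > aic_penalty * df_diff
prefers_larger_bic <- lrt > bic_penalty * df_diff

print(c(p_lrt = p_lrt))
print(c(AIC = prefers_larger_aic, BIC = prefers_larger_bic))
```

With this assumed n_obs the two criteria disagree, which shows how the BIC verdict depends on the number of observations while the AIC verdict does not.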

I would claim that the LRT statistic is always a good way of
evaluating the difference in the quality of fit for two models - it is
how you convert it to a p-value that is not clear when you have small
sample sizes.

The difference between what I would advise and what Ben would advise
regarding the LRT is what the null and alternative models are.  I
would remove the Sex term from both.  He would retain the Sex term in
both.  This will result in slightly different conclusions.

This, by the way, emphasizes the point that a test statistic and its
corresponding p-value are not properties of the BreedSuc term.  They
result from comparing the quality of fits of two models - one with
the term and one without.  When we quote t- or z-statistics, and
p-values, in a coefficients table we are providing a summary of many
different types of tests simultaneously.  Unfortunately the conclusion
that is often drawn from the table is that the p-value is a property
of the term itself, which is wrong.

Perhaps I misunderstood your original posting.  I thought that
SameSite=0 meant that the bird did not return to the same nest site.
That is, site fidelity corresponds to SameSite = 1.

In any case the probabilities constructed as you have done are the
probabilities for SameSite = 1.

Then the 0's and 1's would be reversed for that response variable and
your probabilities would be the complement (i.e. 1-p instead of p) of
those calculated above.
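
A small illustration of the complement, assuming the probabilities were obtained by applying the inverse logit to a linear predictor; the value of eta below is hypothetical and not taken from the fitted model.

```r
eta <- 0.4                   # HYPOTHETICAL linear predictor on the logit scale

p_same  <- plogis(eta)       # P(SameSite = 1) under the original coding
p_other <- 1 - p_same        # P(SameSite = 0), i.e. the complement

# Reversing the 0/1 coding flips the sign of the linear predictor,
# which gives the same complement
all.equal(p_other, plogis(-eta))
```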

I may have been too terse in my explanations.  As mentioned above, I
would claim that the LRT statistic is a reasonable way to compare the
fits of two models, because it is based upon fitting the model with
and without the term of interest.  In the case of a linear model
without random effects it is not necessary to fit the model without
the term just to discover what the LRT statistic would be.  You can
tell from the model fit with the term what the maximum value for the
likelihood of any sub-model will be.
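
To make the linear-model case concrete, here is a sketch on simulated data (the variables x1, x2 and y are invented for the example).  It relies on the standard identity that, for a Gaussian linear model fit by least squares, dropping a single coefficient with t-statistic t changes the deviance by n * log(1 + t^2 / df_residual), so the LRT statistic can be read off the larger fit alone.

```r
set.seed(42)
n  <- 50
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 0.5 * x1 + 0.3 * x2 + rnorm(n)   # simulated response

fit_with    <- lm(y ~ x1 + x2)   # model with the term of interest (x2)
fit_without <- lm(y ~ x1)        # model without it

# LRT statistic the direct way: fit both models, difference the deviances
lrt_two_fits <- as.numeric(2 * (logLik(fit_with) - logLik(fit_without)))

# LRT statistic from the larger fit only, via the t-statistic for x2
t_x2 <- coef(summary(fit_with))["x2", "t value"]
lrt_one_fit <- nobs(fit_with) * log(1 + t_x2^2 / df.residual(fit_with))

all.equal(lrt_two_fits, lrt_one_fit)
```

No such exact shortcut is available once random effects or a non-Gaussian response enter the model, which is the point of the next paragraph.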

In the case of a linear mixed model or a generalized linear mixed
model you can't decide on the basis of the one model fit what the
likelihood for the other will be.  You can approximate it, but you
don't get an exact value.  When it took a very long time to do a model
fit we just used the approximation.  Now that these fits can be done
much more quickly, it makes sense to fit both with and without.

The nature of the approximation is to take the parameter estimate for
BreedSuc and divide it by its approximate standard error.  We call
this the z-statistic because, when everything is working properly,
this should have a distribution close to a standard normal, which we
often write as Z.  The LRT statistic is the difference in the deviance
of the model without and the model with the term.  To me, that is the
quantity of interest and the fact that it should be approximately the
square of the z-statistic is helpful in making rough decisions but I
still want to calculate the difference in the deviance before making a
final decision.

In the summary of fm1 the z-statistic is 1.998 whereas the LRT
statistic comparing fm2 to fm1 is 4.0991.  The square of the
z-statistic will be close to, but not exactly the same as, the LRT
statistic.
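
The arithmetic behind that comparison, as a short base-R check using the two numbers quoted above:

```r
z   <- 1.998     # z-statistic for BreedSuc from the summary of fm1
lrt <- 4.0991    # LRT statistic comparing fm2 to fm1

z_squared <- z^2                 # close to, but smaller than, lrt

# Two-sided p-value from the z-statistic (the Wald approximation)
p_wald <- 2 * pnorm(-abs(z))

# p-value from the LRT statistic, chi-squared with 1 degree of freedom
p_lrt <- pchisq(lrt, df = 1, lower.tail = FALSE)

# The Wald p-value is the same as the chi-squared tail area of z^2
all.equal(p_wald, pchisq(z_squared, df = 1, lower.tail = FALSE))

print(c(z_squared = z_squared, lrt = lrt, p_wald = p_wald, p_lrt = p_lrt))
```

Here both p-values fall just under 0.05, but with a statistic this close to the cutoff the two approaches could easily land on opposite sides of it, which is why I prefer to compute the deviance difference directly.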

You're welcome.  Thanks for the question.