
glmm AIC/LogLik reliability

5 messages · D O S Gillespie, Virgilio Gomez Rubio, Andrew Beckerman +1 more

#
Dear R-Sig-ME -

Let's assume that I am going to use a model-averaging, AIC-based
approach to evaluate nested GLMMs.

I would like to assume that the estimation of AIC and logLik in the
GLMMs of lmer is consistent enough (precise, if not accurate) to use
in this framework. I realize that we don't trust anova(m1, m2), mainly
due to df and test-statistic issues.

I realise that some of you may suggest that this is not the correct
framework. If so, can you distinguish arguments about the philosophy
of AIC model averaging from the practical implementation - i.e. is the
output consistent enough to use even if you don't believe the
answer? Perhaps they are too intertwined.
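
A minimal sketch of the mechanics in question, using lme4 (the data
frame, predictors, and grouping factor are hypothetical; in current
lme4 the call is glmer, while older versions used lmer with a family
argument):

```r
library(lme4)

## Hypothetical data: binary reproduction outcome, grouping factor "site"
set.seed(1)
dat <- data.frame(
  y    = rbinom(200, 1, 0.5),
  x1   = rnorm(200),
  x2   = rnorm(200),
  site = factor(rep(1:20, each = 10))
)

## Nested GLMMs sharing the same random-effects structure
m1 <- glmer(y ~ x1      + (1 | site), data = dat, family = binomial)
m2 <- glmer(y ~ x1 + x2 + (1 | site), data = dat, family = binomial)

## The quantities whose internal consistency is at issue
logLik(m1); logLik(m2)
AIC(m1, m2)

## Akaike weights for model averaging
aics <- c(AIC(m1), AIC(m2))
w <- exp(-0.5 * (aics - min(aics)))
w / sum(w)
```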

Thanks,

Duncan Gillespie
#
I would argue that there's very little we *can* trust
in the realm of GLMM inference, with the exception
of randomization/parametric bootstrapping (and possibly
Bayesian) approaches.

   I think AIC is no worse than anything else in this regard,
except that it hasn't been explored as carefully
as some of the alternatives: thus we suspect by analogy
that there are problems similar to those of the LRT,
but we don't know for sure.
Vaida and Blanchard (2005), Greven (2008), and Burnham
and White (2002) are good references.  There are
two basic issues:
  (1) if you choose to include models that differ
in their random effects components, how do you count
"effective" degrees of freedom?
  (2) how big a sample does it take to reach the
"asymptopia" of AIC?  If you're not there, what is
the best strategy for finite-size correction?  If
you use AICc, what should you put in for effective
residual degrees of freedom?
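
To make issue (2) concrete, the usual finite-size correction
(AICc, as in Burnham and Anderson's treatment) adds a penalty that
depends on an effective sample size n, which is exactly the quantity
that is ambiguous for mixed models. A sketch, where the choice of n
is deliberately left to the user (the fit object is hypothetical):

```r
## AICc = AIC + 2k(k+1)/(n - k - 1), with k = number of parameters
## and n = "effective" sample size -- the contentious quantity for GLMMs
aicc <- function(fit, n) {
  k <- attr(logLik(fit), "df")
  AIC(fit) + 2 * k * (k + 1) / (n - k - 1)
}

## e.g. aicc(m1, n = nobs(m1))   # total observations, or
##      aicc(m1, n = 20)         # number of random-effect units
```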

   Ben Bolker
D O S Gillespie wrote:

#
Hi,
I would also point to the paper by Spiegelhalter et al. (2002) on the
DIC. It is a 'Bayesian version' of the AIC, but the examples and
discussions therein are quite interesting.
We are trying to make a comparison of AIC, cAIC (Vaida and Blanchard,
2005) and DIC in this working paper:

http://www.bias-project.org.uk/papers/ComparisonSAE.pdf

It is still a work in progress, but we have fitted several
linear (mixed) models in the context of Small Area Estimation and
display the values of AIC/cAIC/DIC, together with their penalty
terms, in a table for comparison. The aim is to study to what extent
AIC, cAIC and DIC are comparable under different structures for the
random effects. Any comments are welcome.

Hope this helps.

Virgilio

P.S.: Is there any way of obtaining the design matrix of the random
effects and the variance matrix from an lme object? That would
help to compute the cAIC more easily.
#
Perhaps the question was not clear enough (I helped Duncan try to
articulate this....)

Let's assume that we maintain the random-effects structure in all
models, but have a large multiple-regression problem in the fixed
effects (say 8 variables potentially affecting reproduction in a population).

Can we assume that the LogLik calculations work in this instance?

If we can say yes to this, then we can assume that some calculation of
AIC is possible. The adjustment of the logLik by the number of
parameters can be manipulated by the researcher, who decides what df
means to him or her, etc. The crux of the question is not whether
inference is correct, but whether the bits/mechanics of getting an AIC
value for a set of nested models with the same random effects are
internally consistent.
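
The "manipulate the penalty yourself" step described above can be done
directly from the extracted log-likelihood (the fit `m` and the
parameter count `k` here are hypothetical placeholders, assuming an
lme4-style fit):

```r
## m: a (hypothetical) lmer/glmer fit; k: your own accounting of df
ll <- as.numeric(logLik(m))
k  <- 5                          # researcher-chosen parameter count
aic_manual <- -2 * ll + 2 * k    # AIC with a hand-rolled penalty
```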

Andrew
On 28 Jan 2009, at 19:11, Ben Bolker wrote:

1 day later
#
Andrew Beckerman wrote:
Maintaining the random effects structure takes care of the issue
of counting degrees of freedom for random effects, EXCEPT in the
finite-data (AICc or equivalent) case.
I would guess that you would get correct log-likelihoods/deviances
in this case, if you use ML rather than REML.  (These will essentially
be marginal deviances, integrated over the random effects.)
If you're not worried about inference, then I'd say you're OK.
Likelihood/deviance should correctly rank models with the same degree
of complexity.  But I don't see how you're going to be able to
confidently rank models unless (a) your Ns are so large
that you can assert that you are in "asymptopia" (and N here means
both (?) number of random-effects units and total sample size)
or (b) you can figure out how to inflate penalties based on
"residual df" ...
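
The ML-rather-than-REML point can be checked directly: REML
log-likelihoods are not comparable across models with different fixed
effects, so refit with ML before extracting deviances. A sketch,
assuming a hypothetical Gaussian lmer fit `m1` (glmer fits with a
family are already ML, so no refit is needed there):

```r
## REML is lmer's default for Gaussian responses; refit with ML
## before comparing models that differ in their fixed effects
m1_ml <- update(m1, REML = FALSE)
logLik(m1_ml)   # marginal log-likelihood under ML
```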

  As always, I'm happy to be corrected.

  [blatant plug: I have a GLMM paper available online now
<http://dx.doi.org/10.1016/j.tree.2008.10.008> although much of what
it says will be well known to everyone here ...]

  Ben Bolker