model selection in lme4

7 messages · Ben Bolker, Christopher David Desjardins, Simon Blomberg +1 more

#
It would be better to use AICc, but I'm not sure what I would
use for "number of parameters" for a random effect with n
levels: any number between 0.5 and n seems plausible!
Someone should send Shane Richards (who has done some
very nice work testing (Q)AIC(c) in ecological settings)
and see if he's willing to tackle this one, although I can
imagine he's getting sick of this kind of exercise ...
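The correction in question is AICc = AIC + 2k(k+1)/(n-k-1), which is exactly where the ambiguity bites: the answer depends on what k you plug in for the random effect. A minimal sketch (plain Python, with hypothetical log-likelihood and sample size, not an lme4 computation) showing how much the choice of k moves AICc:

```python
import math

def aicc(log_lik, k, n):
    """Small-sample corrected AIC: AIC + 2k(k+1)/(n-k-1)."""
    aic = -2 * log_lik + 2 * k
    return aic + 2 * k * (k + 1) / (n - k - 1)

# Hypothetical fit: log-likelihood -120.5, n = 50 observations,
# 3 fixed-effect parameters.  Counting a 10-level random effect as
# anywhere between 0.5 and 10 extra "parameters" shifts AICc a lot:
for k_re in (0.5, 1, 5, 10):
    k = 3 + k_re
    print(f"k = {k:5.1f}  AICc = {aicc(-120.5, k, 50):.2f}")
```

With these (made-up) numbers the spread across plausible parameter counts is far larger than the usual 2-unit rule of thumb for distinguishing models, which is the practical force of the complaint above.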

  Ben Bolker
Renwick, A. R. wrote:
freedom?

#
For a discussion of BIC, please see Raftery (1995) in Sociological 
Methodology. Before you commit yourself on the AIC, I do encourage you to 
look at your BIC. In the models I've run, when there is disagreement between 
the BIC and the AIC, it's usually that the AIC selects an overly complex 
model and includes unnecessary parameters.
Cheers,
Chris
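The direction of that disagreement is built into the penalties: AIC charges 2 per parameter while BIC charges log(n), so for n above about 7 BIC always penalizes extra parameters harder and can never pick the larger of two nested models when AIC picks the smaller. A quick illustration (plain Python, hypothetical log-likelihoods):

```python
import math

def aic(log_lik, k):
    return -2 * log_lik + 2 * k

def bic(log_lik, k, n):
    return -2 * log_lik + k * math.log(n)

# Two hypothetical nested models fit to n = 200 observations:
# the larger model buys 5 log-likelihood units with 4 extra parameters.
n = 200
small = {"ll": -500.0, "k": 3}
large = {"ll": -495.0, "k": 7}

a_pick = "large" if aic(large["ll"], large["k"]) < aic(small["ll"], small["k"]) else "small"
b_pick = "large" if bic(large["ll"], large["k"], n) < bic(small["ll"], small["k"], n) else "small"
print("AIC prefers:", a_pick)
print("BIC prefers:", b_pick)
```

Here AIC takes the larger model and BIC the smaller one, which is the pattern Chris describes; whether that counts as AIC "overfitting" is exactly what the rest of the thread argues about.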
On Sunday 15 February 2009 19:50:30 Ben Bolker wrote:
#
Took a (very) quick look at Raftery, which all seems sensible
and well-argued.  However ... the paper contrasts Bayes/BIC
with classical hypothesis testing.  Many of the points listed on p. 155
(better assessment of evidence, applicability to non-nested models, take
model uncertainty into account, allow model averaging, easy to
implement) apply to AIC as well as BIC.  BIC does have many good
qualities (approximation to Bayes factor, sensible "flat prior"
interpretation, statistical consistency, ...).  But the crux of the
argument between BIC and AIC is the difference in their objective. BIC
aims to identify the "true model", which essentially assumes that there
is a sharp cutoff between parameters/processes that are in the model and
those that are out. Burnham and Anderson have a lot to say about
tapering effect sizes; they are zealots about AIC, and I often discount
their enthusiasm, but after much percolation I've decided that AIC
really does make sense for the kinds of questions I (and many
ecologists) tend to ask.

   When you say that AIC selects an overly complex model, how
do you know what the correct model is and which parameters are
unnecessary?  Is this a case of fitting to simulation output?
In that case I might bring up B&A's "tapering effects" argument
again -- selecting the correct model with a fixed number of parameters
with non-tapering effects is what BIC is for, not what AIC is for.
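The tapering-effects point can be made concrete with a toy simulation (plain Python/numpy, not from the thread): generate data whose true coefficients shrink smoothly toward zero, fit the nested sequence of models, and compare which size AIC and BIC select. With tapering effects there is no sharp "correct" dimension for BIC to find, and BIC's heavier penalty systematically stops at a smaller model than AIC.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 100, 12
beta = 1.0 / (1.0 + np.arange(p)) ** 1.5   # tapering effect sizes
X = rng.normal(size=(n, p))
y = X @ beta + rng.normal(size=n)

def fit_stats(k):
    """OLS on the first k predictors (plus intercept); Gaussian log-lik."""
    Xk = np.column_stack([np.ones(n), X[:, :k]])
    b, *_ = np.linalg.lstsq(Xk, y, rcond=None)
    rss = np.sum((y - Xk @ b) ** 2)
    ll = -0.5 * n * (np.log(2 * np.pi * rss / n) + 1)
    npar = k + 2                            # slopes + intercept + sigma
    return -2 * ll + 2 * npar, -2 * ll + npar * np.log(n)

aics, bics = zip(*(fit_stats(k) for k in range(p + 1)))
print("AIC picks k =", int(np.argmin(aics)))
print("BIC picks k =", int(np.argmin(bics)))
```

Because the penalties differ only in the per-parameter charge, the BIC choice can never exceed the AIC choice along a nested sequence; under tapering, the question is which stopping point predicts better, not which is "true".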

  I have tried to say this more coherently at
http://emdbolker.wikidot.com/blog:aic-vs-bic

  As an aside, I don't have a vested interest in this, and I don't
claim that AIC is better for everything ... just that it seems
most ecologists are working with "true models" that are of
arbitrarily large dimension with tapering effects, which is where
AIC should select the model with the best predictive capability ...

  Ben Bolker
Christopher David Desjardins wrote:

#
Vaida and Blanchard [Biometrika (2005), 92, 2, pp. 351–370, Conditional
Akaike information for mixed-effects models] discuss using AIC for model
selection in mixed-effects models, and make recommendations. There is
also a follow-up note by Liang, Wu and Zhou [Biometrika (2008), 95, 3,
pp. 773–778, A note on conditional AIC for linear mixed-effects models].

The general message is that the "type" of AIC statistic will depend on
your motivation for model selection. Is it the fixed effects part of the
model that is of most interest? Or are the random effects of specific
interest too? This "focus" will determine the number of "effective
parameters" in the penalty term (using results from Hodges, J.S. and
Sargent, D. J. (2001). Counting degrees of freedom in hierarchical and
other richly parameterized models. Biometrika 88, 367–79). There is also
the issue of REML v ML estimation...
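The Hodges–Sargent "effective parameters" idea can be sketched outside lme4: for a shrinkage (ridge-type) fit, the effective degrees of freedom is the trace of the hat matrix, which slides between the full count of random-effect levels and zero as the shrinkage increases. A toy numpy sketch (an illustration of the counting idea, not lme4's implementation):

```python
import numpy as np

n, q = 60, 10                                   # 60 observations, 10 groups
Z = np.kron(np.eye(q), np.ones((n // q, 1)))    # group membership matrix

def effective_df(Z, lam):
    """Trace of the hat matrix for the ridge fit
    y_hat = Z (Z'Z + lam I)^{-1} Z' y  (shrunken group means)."""
    H = Z @ np.linalg.solve(Z.T @ Z + lam * np.eye(Z.shape[1]), Z.T)
    return np.trace(H)

# lam -> 0: no shrinkage, ~q effective parameters; lam -> inf: everything
# shrunk to zero, ~0.  Intermediate lam gives a fractional count, which is
# why "any number between 0.5 and n" is the honest answer.
for lam in (1e-8, 1.0, 6.0, 1e8):
    print(f"lambda = {lam:8.1e}  effective df = {effective_df(Z, lam):.2f}")
```

In a mixed model the shrinkage parameter corresponds to the ratio of residual to random-effect variance, so the effective parameter count is itself estimated from the data, which is part of what makes the penalty term delicate.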

Cheers,

Simon.
On Sun, 2009-02-15 at 20:23 -0600, Christopher David Desjardins wrote:
#
I think this may be the case. The data I have used are real, not 
simulated. However, I can tell that the parameters were unnecessary because they 
didn't explain any variation above and beyond the simpler models. I 
think it may also be a case of the type of research questions I ask in 
psychology vs. ecology.

I'm always interested in knowing more about BIC/AIC and I'll check out your 
references.
Thanks!
Chris
On Sunday 15 February 2009 21:02:42 Ben Bolker wrote:
#
The issue seems to be what kind of generalizations one wishes to make.  
This determines what conditioning is appropriate, and it determines  
the distribution with respect to which one tries to find the  
expectation that is involved in calculating the AIC or other such  
statistic.  Should one condition on, e.g., the actual numbers of  
plots at the different sites and the actual number of sites, as in the  
data?  Or should these be treated as random?   It all gets too  
horrible to contemplate.  Vaida and Blanchard, and the Liang & Wu &  
Zhou paper, do not do much more than scratch the surface of these  
complications.

The complications are of the same kind as those involved in  
calculating predicted values.  These differ depending on the  
population to which one wishes to generalize.  The SEs vary also, and  
depend on whether one wants the SE of the prediction, or the SE of the  
equivalent observation.  A focus on prediction may be the way to get a  
clear understanding of what should be optimized.
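The distinction between the SE of the prediction and the SE of the equivalent observation is easiest to see in ordinary regression, where the latter simply adds the residual variance. A sketch (plain numpy, simulated data; the mixed-model version raises the same question for each level of conditioning):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x = rng.uniform(0, 10, n)
y = 2.0 + 0.5 * x + rng.normal(0, 1.0, n)

X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
sigma2 = resid @ resid / (n - 2)            # residual variance estimate
XtX_inv = np.linalg.inv(X.T @ X)

x0 = np.array([1.0, 5.0])                   # predict at x = 5
se_mean = np.sqrt(sigma2 * x0 @ XtX_inv @ x0)        # SE of the fitted mean
se_obs = np.sqrt(sigma2 * (1 + x0 @ XtX_inv @ x0))   # SE of a new observation

print(f"SE of prediction (mean):      {se_mean:.3f}")
print(f"SE of equivalent observation: {se_obs:.3f}")
```

In the mixed-model setting, "equivalent observation" further splits depending on whether the new observation comes from an existing group or a new one, which is the same conditioning question as above.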

John Maindonald             email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
On 16/02/2009, at 3:15 PM, Simon Blomberg wrote: