nlme model specification
On Fri, 2008-05-23 at 13:41 -0700, Kingsford Jones wrote:
On Fri, May 23, 2008 at 12:05 PM, David Hewitt <dhewitt37 at gmail.com> wrote:
On Thu, May 22, 2008 at 5:55 PM, Caroline Lehmann: Models were compared and ranked using AICc. I would suggest modifying this to BIC since there are so many measurements.
Is there theory to support this suggestion? I find choosing an *IC to be a confusing issue and would appreciate any pointers to theory, simulations, etc that may shed some light on the subject.
Choosing a model selection method is indeed a confusing process. However, it does not simply depend on the number of measurements. Oversimplifying things, AIC/AICc are used for model selection following maximum likelihood model fitting and BIC (and Bayes factors) are used in a Bayesian context (likelihood + priors). The two approaches are different in a number of ways.
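To make the comparison concrete, here is a minimal sketch of the standard textbook definitions of AIC, AICc, and BIC, computed from a maximized log-likelihood. The two models and their log-likelihood values are hypothetical, chosen only to show that the criteria can disagree; they do not come from the analysis discussed in this thread. Note also that what counts as n for grouped (random-effects) data is itself debated, so the n used here is an assumption.

```python
import math


def aic(loglik, k):
    """Akaike information criterion: -2 log L + 2k."""
    return -2.0 * loglik + 2.0 * k


def aicc(loglik, k, n):
    """Small-sample corrected AIC; requires n > k + 1."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    return aic(loglik, k) + (2.0 * k * (k + 1)) / (n - k - 1)


def bic(loglik, k, n):
    """Bayesian (Schwarz) information criterion: -2 log L + k log(n)."""
    return -2.0 * loglik + k * math.log(n)


# Hypothetical fits: model B has three more parameters than model A
# and a log-likelihood that is 4 units higher.
n = 1000
models = {"A": (-500.0, 3), "B": (-496.0, 6)}

for name, (loglik, k) in models.items():
    print(name, aic(loglik, k), aicc(loglik, k, n), bic(loglik, k, n))

# With these numbers AIC and AICc prefer B (lower value), while BIC
# prefers A, because BIC charges log(n) per parameter instead of 2.
```

The per-parameter penalty is 2 for AIC and log(n) for BIC, so BIC penalizes extra parameters more heavily once n exceeds about 7 (e^2 is roughly 7.4). That is a difference in how strongly the criteria penalize complexity, not by itself an argument for preferring one of them.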
I don't think it is useful to put this in a Bayesian vs. frequentist framework. Burnham and Anderson write: "AIC can be justified as Bayesian using a 'savvy' prior on models that is a function of sample size and the number of model parameters. Furthermore, BIC can be derived as a non-Bayesian result. Therefore, arguments about using AIC versus BIC for model selection cannot be from a Bayes versus frequentist perspective."
I've read that paper, and that part really annoyed me. Firstly, the "savvy" prior approach requires you to look at the data to establish the prior. In what sense is that then a prior distribution? Secondly, the "savvy" prior must have THE SAME precision as the likelihood. Why would we want a prior with this property? Surely the precision of the prior should reflect our precision of prior knowledge, and not be dependent on the data. (Also, using American slang for a concept does not necessarily make it statistically sound.) The fact that it requires you to contort Bayesian theory into infeasible knots in order to reconcile it with AIC model selection suggests to me that the two really aren't very compatible.
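For readers who have not seen the argument, the following is a sketch of how the "savvy" prior is usually presented. The notation is an assumption introduced here, not taken from this thread: \(k_i\) is the number of parameters in model \(i\), \(n\) the sample size, \(L_i\) the maximized likelihood, and \(q_i\) the prior probability on model \(i\). The exact form of the prior is stated from memory of Burnham and Anderson's presentation and should be checked against the paper.

```latex
% BIC-based approximation to posterior model probabilities with prior q_i:
\Pr(M_i \mid \text{data}) \approx
  \frac{\exp\!\left(-\tfrac{1}{2}\,\mathrm{BIC}_i\right) q_i}
       {\sum_r \exp\!\left(-\tfrac{1}{2}\,\mathrm{BIC}_r\right) q_r},
\qquad
\mathrm{BIC}_i = -2\log L_i + k_i \log n .

% "Savvy" prior (assumed form):
q_i \propto \exp\!\left(\tfrac{1}{2}\,k_i \log n - k_i\right).

% Substituting, the log(n) terms cancel:
\exp\!\left(-\tfrac{1}{2}\,\mathrm{BIC}_i\right) q_i
  \propto \exp\!\left(\log L_i - \tfrac{1}{2}\,k_i\log n
                      + \tfrac{1}{2}\,k_i\log n - k_i\right)
  = \exp\!\left(\log L_i - k_i\right)
  = \exp\!\left(-\tfrac{1}{2}\,\mathrm{AIC}_i\right).
```

Under that assumed form, the posterior model probabilities reduce to Akaike weights. The prior on models depends on the sample size \(n\), which appears to be the feature the objection above is aimed at.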
So NO, there is no theory specifically pointing to BIC and a Bayesian
strategy because there are "lots of measurements". However, there is more
theory than you (and I) care to know about regarding when to use a Bayesian
framework versus a "pure" likelihood framework. And, as in all academic
disputes, there is no clear consensus.
There's AIC/AICc, BIC, DIC, TIC, etc. and then simulation-based criteria as
well (about which I know zip). You can read Burnham and Anderson (2002) to
get their opinions about AIC and the information-theoretic strategy, and
among many other references I think EJ Wagenmakers sums up the Bayesian
perspective well in the 2007 paper listed here ("practical solution to the
p-value problem"):
http://users.fmg.uva.nl/ewagenmakers/papers.html
It's a good read, even if you disagree with his conclusions about the
Bayesian strategy.
All that said, since you're dealing with random effects, Bayesian approaches
do appear to have the upper hand at present, and a shift in that direction
may be warranted.
Can you expound on the last paragraph? Thank you, Kingsford Jones
----- David Hewitt Research Fishery Biologist USGS Klamath Falls Field Station (USA)
_______________________________________________ R-sig-ecology mailing list R-sig-ecology at r-project.org https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
Simon Blomberg, BSc (Hons), PhD, MAppStat. Lecturer and Consultant Statistician Faculty of Biological and Chemical Sciences The University of Queensland St. Lucia Queensland 4072 Australia Room 320 Goddard Building (8) T: +61 7 3365 2506 http://www.uq.edu.au/~uqsblomb email: S.Blomberg1_at_uq.edu.au Policies: 1. I will NOT analyse your data for you. 2. Your deadline is your problem. The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. - John Tukey.