I agree with Doug. R2 for anything other than an ordinary linear model is rearranging deck chairs on the Titanic. GLMs and GLMMs are complicated. They can be wrong in a variety of ways, and a single number like R2 (however defined) is a poor way to assess the relative fit of a model. Pseudo R2s don't answer the same question as R2 for an OLS model anyway, as Doug pointed out. My approach would be to use posterior predictive tests in a Bayesian context, or perhaps cross-validation.
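As a rough illustration of the predictive-check idea, here is a minimal sketch in Python using only the standard library. It is not a full Bayesian posterior predictive test: it plugs in the fitted Poisson mean instead of drawing parameters from a posterior. The simulated data, the helper names (rpois, rnegbin, predictive_check) and the choice of the sample variance as the test statistic are all assumptions made for the example.

```python
import math
import random
import statistics


def rpois(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def rnegbin(rng, mean, size):
    """Draw one negative binomial variate as a gamma-Poisson mixture.

    The variance is mean + mean**2 / size, so small `size` means strong
    overdispersion relative to the Poisson.
    """
    lam = rng.gammavariate(size, mean / size)  # shape = size, scale = mean / size
    return rpois(rng, lam)


def predictive_check(y, n_rep=500, seed=1):
    """Plug-in predictive check of an intercept-only Poisson model.

    Returns the share of replicated data sets whose sample variance is at
    least as large as the observed one.
    """
    rng = random.Random(seed)
    lam_hat = statistics.fmean(y)          # Poisson MLE for an intercept-only model
    observed_var = statistics.variance(y)  # test statistic on the real data
    exceed = 0
    for _ in range(n_rep):
        y_rep = [rpois(rng, lam_hat) for _ in y]
        if statistics.variance(y_rep) >= observed_var:
            exceed += 1
    return exceed / n_rep


# Hypothetical data: overdispersed counts that a Poisson model should fail to mimic.
data_rng = random.Random(42)
y = [rnegbin(data_rng, mean=5.0, size=2.0) for _ in range(200)]

p_value = predictive_check(y)
print("predictive p-value for the variance:", p_value)
```

A value close to 0 (or 1) says that data simulated from the fitted model rarely look like the observed data with respect to the chosen statistic. In a real analysis the statistic should target the features you care about (zeros, maxima, group-level variation), and the replicates should come from the actual fitted model.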
Cheers,
Simon.
On 19 Dec 2014, at 1:36 am, Jens Oldeland <fbda005 at uni-hamburg.de> wrote:
Dear Douglas,
many thanks for your thoughts. I understand that R2 is not perfectly correct for GLMs or anything more complicated. But still...
In my example, I have now fitted these 20 negative binomial GLMMs, and if anybody asks me how reliable they are, I cannot tell. Following the AIC approach, I found the best of my candidate models, i.e. for each model I checked all possible parameter combinations in order to identify the "best" model (yes, there is no best model, and yes, searching for a model using this procedure is certainly not optimal). I can calculate AIC weights, which tell me how the models compare with one another, but not whether any of them is any good.
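To make the point about AIC weights concrete, here is a small Python sketch (standard library only). The function name akaike_weights and the AIC values are made up for illustration. The weights depend only on AIC differences within the candidate set, so shifting every AIC by the same amount leaves them unchanged. That is why they cannot say whether any model fits well in absolute terms.

```python
import math


def akaike_weights(aic_values):
    """Return Akaike weights: exp(-delta_i / 2), normalised to sum to 1."""
    best = min(aic_values)
    rel_lik = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel_lik)
    return [r / total for r in rel_lik]


# Hypothetical AIC values for three candidate models.
aic = [1012.4, 1014.1, 1020.9]
weights = akaike_weights(aic)

# The same models with every AIC made 500 units worse: identical weights.
shifted = akaike_weights([a + 500.0 for a in aic])

print([round(w, 3) for w in weights])
print([round(w, 3) for w in shifted])
```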
How can I know? Are there any possibilities to check this? Plotting observed versus predicted?
I mean, can I publish something without knowing this? I am an ecologist, so I am not perfectly trained in statistics and also not in assessing the quality of GLMMs.
Don't worry, I am not in a bad mood while writing this. I am just curious how this can be solved.
best regards from Hamburg, Germany
jens
Quoting Douglas Bates <bates at stat.wisc.edu>:
<sermon>
I must admit to getting a little twitchy when people speak of the "R2 for
GLMMs". R2 for a linear model is well-defined and has many desirable
properties. For other models one can define different quantities that
reflect some but not all of these properties. But this is not calculating
an R2 in the sense of obtaining a number having all the properties that the
R2 for linear models does. Usually there are several different ways that
such a quantity could be defined. Especially for GLMs and GLMMs before you
can define "proportion of response variance explained" you first need to
define what you mean by "response variance". The whole point of GLMs and
GLMMs is that a simple sum of squares of deviations does not meaningfully
reflect the variability in the response because the variance of an
individual response depends on its mean.
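The contrast can be written out as follows. This is only a sketch in standard textbook notation; the symbols and the quadratic negative binomial parameterisation are assumptions, not something fixed by the discussion above.

```latex
% R^2 for an ordinary linear model: every squared deviation gets equal weight,
% which is sensible when Var(y_i) = \sigma^2 for all i.
R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}

% In a GLM the variance is a function of the mean, for example:
\text{Poisson:} \quad \operatorname{Var}(y_i) = \mu_i
\qquad
\text{negative binomial:} \quad \operatorname{Var}(y_i) = \mu_i + \frac{\mu_i^2}{\theta}
```

Under either variance function a deviation of a given size is far less surprising at a large mean than at a small one, so the unweighted sums of squares in the first formula no longer measure "response variance" in any single agreed sense.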
Confusion about what constitutes R2 or degrees of freedom or any of the
other quantities associated with linear models as applied to other models
comes from confusing the formula with the concept. Although formulas are
derived from models the derivation often involves quite sophisticated
mathematics. To avoid a potentially confusing derivation and just "cut to
the chase" it is easier to present the formulas. But the formula is not
the concept. Generalizing a formula is not equivalent to generalizing the
concept. And those formulas are almost never used in practice, especially
for generalized linear models, analysis of variance and random effects. I
have a "meta-theorem" that the only quantity actually calculated according
to the formulas given in introductory texts is the sample mean.
It may seem that I am being a grumpy old man about this, and perhaps I am,
but the danger is that people expect an "R2-like" quantity to have all the
properties of an R2 for linear models. It can't. There is no way to
generalize all the properties to a much more complicated model like a GLMM.
I was once on the committee reviewing a thesis proposal for Ph.D.
candidacy. The proposal was to examine, I think, 9 different formulas that
could be considered ways of computing an R2 for a nonlinear regression
model to decide which one was "best". Of course, this would be done
through a simulation study with only a couple of different models and only
a few different sets of parameter values for each. My suggestion that this
was an entirely meaningless exercise was not greeted warmly.
</sermon>
On Wed Dec 17 2014 at 9:49:28 AM Jens Oldeland <fbda005 at uni-hamburg.de>
wrote:
Dear List-members,
recently, the R2 calculations for GLMMs proposed by Nakagawa and
Schielzeth (2013) [1] were implemented in the MuMIn package. This is
incredibly good news, as many colleagues still require R2 to understand
a model output. I invested 2 weeks in lengthy calculations of about 20
negative binomial GLMMs using the glmmADMB package. Now, my colleagues
want the R2 (me too); sadly, however, the MuMIn functions only work
for binomial and Poisson GLMMs. Further, it seems that the functions do
not recognize glmmADMB objects and expect (g)lmer output.
Now my question: Does anybody know if this is "easy" to implement
and if so "how"? I tried to redo the code provided here (actually posing
the same question) but failed...:
http://stats.stackexchange.com/questions/109215/r%C2%B2-squared-from-a-generalized-linear-mixed-effects-models-glmm-using-a-negat
Or does anybody know if in the near future (this year?) it will be
implemented somewhere?
Is it possible to transform a glmmADMB object into an lmer object?
Any hints are most welcome,
merry Xmas
Jens
[1] Nakagawa, S., & Schielzeth, H. (2013). A general and simple method
for obtaining R2 from generalized linear mixed-effects models. Methods
in Ecology and Evolution, 4(2), 133-142.
--
+++++++++++++++++++++++++++++++++++++++++
Dr. Jens Oldeland
Post-Doc Researcher & Lecturer @ BEE
Managing Editor - Biodiversity & Ecology
Biodiversity, Ecology and Evolution of Plants (BEE)
Biocentre Klein Flottbek and Botanical Garden
University of Hamburg
Ohnhorststr. 18
22609 Hamburg,
Germany
+++++++++++++++++++++++++++++++++++++++++