R2 for Negative Binomial calculated with GLMMADMB

10 messages · Douglas Bates, Kevin Wright, Rolf Turner +4 more

#
Dear List-members,

recently, the R2 calculations for GLMMs proposed by Nakagawa and 
Schielzeth [1] were implemented in the MuMIn package. This is 
very good news, as many colleagues still require an R2 to understand 
model output. I spent 2 weeks on lengthy calculations for about 20 
negative binomial GLMMs using the glmmADMB package. Now my colleagues 
want the R2 (and so do I), but sadly the MuMIn functions only work 
for binomial and Poisson GLMMs. Further, it seems that the functions do 
not recognize glmmADMB objects and expect (g)lmer output.

Now my question: Does anyone know whether this is "easy" to implement 
and, if so, how? I tried to adapt the code provided here (in a post asking 
the same question) but failed:
http://stats.stackexchange.com/questions/109215/r%C2%B2-squared-from-a-generalized-linear-mixed-effects-models-glmm-using-a-negat

Or does anybody know if in the near future (this year?) it will be 
implemented somewhere?

Is it possible to transform a GLMMADMB object into an lmer object?

Any hints are most welcome,

merry Xmas
Jens


[1] Nakagawa, S., & Schielzeth, H. (2013). A general and simple method 
for obtaining R2 from generalized linear mixed-effects models. Methods 
in Ecology and Evolution, 4(2), 133-142.
#
<sermon>
I must admit to getting a little twitchy when people speak of the "R2 for
GLMMs".  R2 for a linear model is well-defined and has many desirable
properties.  For other models one can define different quantities that
reflect some but not all of these properties.  But this is not calculating
an R2 in the sense of obtaining a number having all the properties that the
R2 for linear models does.  Usually there are several different ways that
such a quantity could be defined.  Especially for GLMs and GLMMs before you
can define "proportion of response variance explained" you first need to
define what you mean by "response variance".  The whole point of GLMs and
GLMMs is that a simple sum of squares of deviations does not meaningfully
reflect the variability in the response because the variance of an
individual response depends on its mean.

Confusion about what constitutes R2 or degrees of freedom or any of the
other quantities associated with linear models as applied to other models
comes from confusing the formula with the concept.  Although formulas are
derived from models the derivation often involves quite sophisticated
mathematics.  To avoid a potentially confusing derivation and just "cut to
the chase" it is easier to present the formulas.  But the formula is not
the concept.  Generalizing a formula is not equivalent to generalizing the
concept.  And those formulas are almost never used in practice, especially
for generalized linear models, analysis of variance and random effects.  I
have a "meta-theorem" that the only quantity actually calculated according
to the formulas given in introductory texts is the sample mean.

It may seem that I am being a grumpy old man about this, and perhaps I am,
but the danger is that people expect an "R2-like" quantity to have all the
properties of an R2 for linear models.  It can't.  There is no way to
generalize all the properties to a much more complicated model like a GLMM.

I was once on the committee reviewing a thesis proposal for Ph.D.
candidacy.  The proposal was to examine I think 9 different formulas that
could be considered ways of computing an R2 for a nonlinear regression
model to decide which one was "best".  Of course, this would be done
through a simulation study with only a couple of different models and only
a few different sets of parameter values for each. My suggestion that this
was an entirely meaningless exercise was not greeted warmly.
</sermon>
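
The point above about response variance depending on the mean can be seen in a small simulation. This is only a sketch: the value of theta, the three means, and the sample size are arbitrary choices made for illustration, and the NB2 variance formula mu + mu^2/theta is the one quoted later in this thread.

```r
# Sketch: for negative binomial counts the variance is tied to the mean,
# so a single pooled "response variance" has no obvious meaning.
set.seed(42)

theta <- 2              # assumed dispersion parameter (illustrative)
mus   <- c(1, 5, 25)    # three illustrative means

# Empirical variance of simulated counts at each mean
sim_var <- sapply(mus, function(mu) var(rnbinom(1e5, mu = mu, size = theta)))

# Theoretical NB2 variance: mu + mu^2 / theta
theory_var <- mus + mus^2 / theta

data.frame(mean = mus, simulated = sim_var, theoretical = theory_var)
```

The simulated variances grow roughly quadratically with the mean, which is why a raw sum of squared deviations mixes observations that are not on a comparable scale.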

On Wed Dec 17 2014 at 9:49:28 AM Jens Oldeland <fbda005 at uni-hamburg.de>
wrote:

  
  
#
Doug wrote:

            
R> x <- c(1e-20, 1, -1)
R> (x[1] + x[2] + x[3])/3
[1] 0
R> (x[3] + x[2] + x[1])/3
[1] 3.333333e-21
R> mean(x)
[1] 0

Looks like he's right!

Kevin
#
On 18/12/14 04:18, Jens Oldeland wrote:
I would suggest that if your colleagues require R2 to "understand" the 
output from a glmm model, then they neither understand glmm models nor R2.

cheers,

Rolf Turner
#
Dear Douglas,

many thanks for your thoughts. I understand that R2 is not perfectly  
correct for GLMs or anything more complicated. But still...

In my example, I have now fitted these 20 negative binomial GLMMs, and if  
anybody asks me how reliable they are, I cannot tell. Following the AIC  
approach, I found the best of my candidate models, i.e. for each model  
I checked all possible parameter combinations in order to identify the  
"best" model (yes, there is no best model, and yes, searching for a model  
using this procedure is surely not optimal). I can calculate AIC  
weights, which tell me how different my models are, but not whether a  
model is any good.

How can I know? Are there any possibilities to check this? Plotting  
observed versus predicted?

I mean, can I publish something without knowing this? I am an  
ecologist, so I am not perfectly trained in statistics and also not in  
assessing the quality of GLMMs.

Don't worry, I am not in a bad mood while writing, just curious how  
this can be solved.

best regards from Hamburg, Germany
jens


Quoting Douglas Bates <bates at stat.wisc.edu>:

  
    
#
On 14-12-17 03:14 PM, Rolf Turner wrote:
I had a brief look at Schielzeth & Nakagawa 2012; they don't give
any immediately useful expression for the 'distribution-specific
variance' term sigma^2_d, and going back to Nakagawa and Schielzeth
2010 ("Repeatability for Gaussian and non-Gaussian data ...", cited by
SN2012 for distribution-specific variances) points further into the
weeds, as they say "There are other options, like negative binomial
models, that could also be considered (but are not treated here)" and
refer the reader to papers by Carrasco 2009 and Carrasco and Jover
2005 ...  (Presumably the problem here is that the overdispersion in
the standard NB2 parameterization is neither additive nor
multiplicative.)  This probably wouldn't be too hard to work out in a
few hours of thought, but ...

  If you just need to make reviewers/colleagues happy, you could
always use the squared correlation coefficient between fitted and
observed values (I think Doug Bates has suggested this in the past).
This is certainly "an" R^2 measure, if not "the" R^2 measure.
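
A minimal sketch of this squared-correlation measure follows. To keep it self-contained it uses simulated counts and a negative binomial GLM without random effects, fitted with `MASS::glm.nb`; that model is a stand-in, not the glmmADMB fits discussed in this thread. For a mixed model the same calculation would use whatever fitted values on the response scale the fitting package returns.

```r
# Sketch: squared correlation between observed and fitted values.
library(MASS)  # glm.nb(); MASS is a recommended package shipped with R

set.seed(1)
n <- 200
x <- rnorm(n)
# Simulated counts; the coefficients and theta = 2 are illustrative assumptions
y <- rnbinom(n, mu = exp(0.5 + 0.8 * x), size = 2)

fit <- glm.nb(y ~ x)

# fitted() returns means on the response (count) scale
r2_cor <- cor(y, fitted(fit))^2
r2_cor

# The same quantities can be inspected visually:
# plot(fitted(fit), y); abline(0, 1)
```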


#
Dear Jens

Our proposed R2 is not 'the' R2 but is also an R2 for mixed models that has several of the useful properties of traditional R2 - actually first proposed by Snijders & Bosker (1994).

Let's say NB(lambda, theta) with the log link: the mean = lambda, and the variance = lambda + lambda^2/theta

The level 1 variance (on the link scale) should be ln(1+1/lambda+1/theta): see the Appendix of our paper, Nakagawa & Schielzeth (2013)

For lambda, it is good to use mean(Y) (Y is the response, i.e. the counts), and the package should give you the value of theta (one should also use mean(Y) for Poisson models).

Here the level 1 variance, sigma^2_1= sigma^2_e (additive over-dispersion)+sigma^2_d (distribution specific) = sigma^2_epsilon (residual variance) as in our paper (2013).

But Holger and I are doing some simulation study to check this first before its use, and we think we can extend the proposed R2 to other distributions although we need to test a few things first (we should be ready in one month or so). 
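
The level 1 variance described above can be turned into a small calculation. This is a sketch under stated assumptions: the function names `nb_level1_variance` and `r2_glmm` are hypothetical, the numeric inputs are made up, and the fixed-effect and random-effect variances (on the link scale) are assumed to have been extracted from the fitted model beforehand. The marginal and conditional ratios follow the definitions in Nakagawa & Schielzeth (2013).

```r
# Level 1 (observation-level) variance on the log-link scale for
# NB(lambda, theta): ln(1 + 1/lambda + 1/theta)
nb_level1_variance <- function(lambda, theta) {
  log(1 + 1 / lambda + 1 / theta)
}

# Marginal and conditional R2 from variance components on the link scale.
# var_fixed:  variance of the fixed-effect linear predictor
# var_random: sum of the random-effect variances
# var_level1: observation-level variance (see function above)
r2_glmm <- function(var_fixed, var_random, var_level1) {
  total <- var_fixed + var_random + var_level1
  c(marginal    = var_fixed / total,
    conditional = (var_fixed + var_random) / total)
}

# Illustrative numbers only: lambda would be mean(Y), theta comes from the fit
v1 <- nb_level1_variance(lambda = 4, theta = 2)
r2 <- r2_glmm(var_fixed = 0.3, var_random = 0.2, var_level1 = v1)
r2
```

Given the caveat above that the approach was still being checked by simulation, the result should be treated as provisional.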

Best wishes,

Shinichi

Shinichi Nakagawa, PhD
(Associate Professor of Behavioural Ecology)
Department of Zoology
University of Otago
340 Great King Street
P. O. Box 56
Dunedin, New Zealand
Tel:  +64-3-479-5046
Fax: +64-3-479-7584
http://sparrow.otago.ac.nz/
#
I agree with Doug. R2 for anything other than an ordinary linear model is rearranging deck chairs on the Titanic. GLMs and GLMMs are complicated. They can be wrong in a variety of ways, and relying on a single number like R2 (however defined) is a poor way to assess the relative fit of a model. Pseudo-R2s don't answer the same question as R2 for an OLS model anyway, as Doug pointed out. My approach would be to use posterior predictive tests in a Bayesian context, or perhaps cross-validation.
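
As a rough sketch of the cross-validation idea, the following compares two candidate models by their out-of-sample log predictive density over 5 folds. It assumes simulated data and a negative binomial GLM without random effects (`MASS::glm.nb`) in place of a GLMM; the helper `cv_logscore` and all numeric settings are illustrative. With grouped data the folds would normally need to respect the grouping structure, which this sketch ignores.

```r
# Sketch: K-fold cross-validation using the held-out log predictive density.
library(MASS)  # glm.nb()

set.seed(11)
n   <- 200
dat <- data.frame(x = rnorm(n))
dat$y <- rnbinom(n, mu = exp(0.5 + 0.8 * dat$x), size = 2)  # simulated counts

k    <- 5
fold <- sample(rep(seq_len(k), length.out = n))  # random fold assignment

# Sum of log predictive densities over all held-out observations
cv_logscore <- function(formula) {
  total <- 0
  for (i in seq_len(k)) {
    train <- dat[fold != i, ]
    test  <- dat[fold == i, ]
    fit   <- glm.nb(formula, data = train)
    mu    <- predict(fit, newdata = test, type = "response")
    total <- total + sum(dnbinom(test$y, mu = mu, size = fit$theta, log = TRUE))
  }
  total
}

cv_null <- cv_logscore(y ~ 1)
cv_full <- cv_logscore(y ~ x)
c(null = cv_null, full = cv_full)  # higher (less negative) is better
```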

Cheers,

Simon.

#
On 14-12-18 04:48 PM, Simon Blomberg wrote:
I agree with this position, *but* I will say that if this is going
to be the case then we (the expert-y people) need to provide more
worked examples of how to do this.  There's at least one example of a
posterior predictive simulation at
http://www.rpubs.com/bbolker/glmmchapter ...
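
As a simpler stand-in for such a worked example, here is a sketch of a simulation-based check. It is not taken from the linked chapter. It simulates replicate data sets from a fitted model using plug-in estimates, so it is a parametric simulation rather than a full posterior predictive draw, and it uses a negative binomial GLM (`MASS::glm.nb`) on simulated data instead of a GLMM. The test statistic, the proportion of zeros, is one arbitrary choice among many.

```r
# Sketch: does the fitted model reproduce the observed proportion of zeros?
library(MASS)  # glm.nb()

set.seed(7)
n <- 300
x <- rnorm(n)
y <- rnbinom(n, mu = exp(0.2 + 0.6 * x), size = 1.5)  # simulated counts

fit <- glm.nb(y ~ x)

# Replicate data sets drawn from the fitted means and estimated theta
n_sim <- 1000
zeros_sim <- replicate(
  n_sim,
  mean(rnbinom(n, mu = fitted(fit), size = fit$theta) == 0)
)
zeros_obs <- mean(y == 0)

# Share of replicates with at least as many zeros as observed;
# values near 0 or 1 suggest the model misrepresents the zeros
p_zero <- mean(zeros_sim >= zeros_obs)
c(observed = zeros_obs, simulated_mean = mean(zeros_sim), p = p_zero)
```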
#
Thanks to everyone who has replied so far, and in particular thanks to Ben 
Bolker for the posterior predictive test example on RPubs! I have not 
worked with Bayesian statistics so far, but I see that it is necessary, not 
only in the case of complex models but mostly there. Just yesterday, I dug 
through the first 4 chapters of the Introduction to WinBUGS... :)

Simple questions often provoke the most interesting discussions, don't 
they? :)
I learned something.

Thanks again and merry Xmas to all of you,

Jens


On 18.12.2014 at 22:48, Simon Blomberg wrote: