R2 for Negative Binomial calculated with GLMMADMB - R-SIG-mixed-models

Wed, Dec 17, 2014 7:18 AM #

Dear List-members,

recently, the R2 calculations for GLMMs proposed by Nakagawa and 
Schielzeth (2013) [1] were implemented in the MuMIn package. This is 
very good news, as many colleagues still require an R2 to understand 
model output. I invested 2 weeks in lengthy calculations of about 20 
negative binomial GLMMs using the glmmADMB package. Now my colleagues 
want the R2 (me too); sadly, however, the MuMIn functions only work 
for binomial and Poisson GLMMs. Further, it seems that the functions do 
not recognize glmmADMB fits and expect (g)lmer output.

Now my question: Does anyone know whether this is "easy" to implement 
and, if so, how? I tried to redo the code provided here (which actually 
poses the same question) but failed:
http://stats.stackexchange.com/questions/109215/r%C2%B2-squared-from-a-generalized-linear-mixed-effects-models-glmm-using-a-negat

Or does anybody know if in the near future (this year?) it will be 
implemented somewhere?

Is it possible to transform a glmmADMB object into an lmer object?

Any hints are most welcome,

merry Xmas
Jens


[1] Nakagawa, S., & Schielzeth, H. (2013). A general and simple method 
for obtaining R2 from generalized linear mixed-effects models. Methods 
in Ecology and Evolution, 4(2), 133-142.

+++++++++++++++++++++++++++++++++++++++++
Dr. Jens Oldeland

Post-Doc Researcher & Lecturer @ BEE
Managing Editor - Biodiversity & Ecology

Biodiversity, Ecology and Evolution of Plants (BEE)
Biocentre Klein Flottbek and Botanical Garden
University of Hamburg
Ohnhorststr. 18
22609 Hamburg,
Germany

Tel:    0049-(0)40-42816-407
Fax:    0049-(0)40-42816-543
Mail: 	jens.oldeland at uni-hamburg.de
         Oldeland at gmx.de 	
Skype:	jens.oldeland
http://www.biologie.uni-hamburg.de/bzf/fbda005/fbda005.htm
http://www.biodiversity-plants.de/biodivers_ecol/biodivers_ecol.php
+++++++++++++++++++++++++++++++++++++++++



Douglas Bates

Wed, Dec 17, 2014 8:26 AM #

<sermon>
I must admit to getting a little twitchy when people speak of the "R2 for
GLMMs".  R2 for a linear model is well-defined and has many desirable
properties.  For other models one can define different quantities that
reflect some but not all of these properties.  But this is not calculating
an R2 in the sense of obtaining a number having all the properties that the
R2 for linear models does.  Usually there are several different ways that
such a quantity could be defined.  Especially for GLMs and GLMMs before you
can define "proportion of response variance explained" you first need to
define what you mean by "response variance".  The whole point of GLMs and
GLMMs is that a simple sum of squares of deviations does not meaningfully
reflect the variability in the response because the variance of an
individual response depends on its mean.

Confusion about what constitutes R2, degrees of freedom, or any of the
other quantities associated with linear models as applied to other models
comes from confusing the formula with the concept.  Although formulas are
derived from models the derivation often involves quite sophisticated
mathematics.  To avoid a potentially confusing derivation and just "cut to
the chase" it is easier to present the formulas.  But the formula is not
the concept.  Generalizing a formula is not equivalent to generalizing the
concept.  And those formulas are almost never used in practice, especially
for generalized linear models, analysis of variance and random effects.  I
have a "meta-theorem" that the only quantity actually calculated according
to the formulas given in introductory texts is the sample mean.

It may seem that I am being a grumpy old man about this, and perhaps I am,
but the danger is that people expect an "R2-like" quantity to have all the
properties of an R2 for linear models.  It can't.  There is no way to
generalize all the properties to a much more complicated model like a GLMM.

I was once on the committee reviewing a thesis proposal for Ph.D.
candidacy.  The proposal was to examine I think 9 different formulas that
could be considered ways of computing an R2 for a nonlinear regression
model to decide which one was "best".  Of course, this would be done
through a simulation study with only a couple of different models and only
a few different sets of parameter values for each. My suggestion that this
was an entirely meaningless exercise was not greeted warmly.
</sermon>
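
The point that "the variance of an individual response depends on its mean" can be illustrated with a small simulation. This is only a sketch in base R under assumed values (the means in `mu` and the sample size are arbitrary choices, not taken from any model in this thread): for Poisson responses the variance tracks the mean, so there is no single "response variance" against which a proportion explained could be measured.

```r
# Simulate Poisson responses at several (arbitrarily chosen) means.
# For a Poisson response the variance equals the mean, so the spread of
# the response changes from observation to observation in a GLM.
set.seed(1)
mu <- c(1, 5, 25, 100)                        # illustrative means
sim <- sapply(mu, function(m) rpois(1e5, lambda = m))
emp_var <- apply(sim, 2, var)                 # empirical variance per mean

# Side-by-side comparison of the mean and the empirical variance.
print(round(rbind(mean = mu, variance = emp_var), 2))
```

A raw sum of squared deviations pools these very different variances as if they were one, which is why it does not carry over from the linear model.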


_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Kevin Wright

Wed, Dec 17, 2014 8:58 AM #

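Doug wrote that the only quantity actually calculated according to the
formulas given in introductory texts is the sample mean: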

R> x <- c(1e-20, 1, -1)
R> (x[1] + x[2] + x[3])/3
[1] 0
R> (x[3] + x[2] + x[1])/3
[1] 3.333333e-21
R> mean(x)
[1] 0

Looks like he's right!

Kevin

Rolf Turner

Wed, Dec 17, 2014 12:14 PM #

On 18/12/14 04:18, Jens Oldeland wrote:

I would suggest that if your colleagues require R2 to "understand" the 
output from a glmm model, then they neither understand glmm models nor R2.

cheers,

Rolf Turner

Rolf Turner
Technical Editor ANZJS

Jens Oldeland

Wed, Dec 17, 2014 12:17 PM #

Dear Douglas,

many thanks for your thoughts. I understand that R2 is not strictly 
correct for GLMs or anything more complicated. But still...

I have now fitted these 20 negbin GLMMs, and if anybody asks me how 
reliable they are, I cannot tell. Following the AIC approach, I found 
the best of my candidate models, i.e. for each model I checked all 
possible parameter combinations in order to identify the "best" model 
(yes, there is no best model, and yes, searching for a model with this 
procedure is surely not optimal). I can calculate AIC weights, which 
tell me how my models differ from one another but not whether the 
model is any good.

How can I know? Are there any ways to check this? Plotting observed 
versus predicted values?

I mean, can I publish something without knowing this? I am an 
ecologist, so I am not thoroughly trained in statistics, nor in 
assessing the quality of GLMMs.

Don't worry, I am not in a bad mood while writing, just curious how 
this can be solved.

best regards from Hamburg, Germany
jens



Ben Bolker

Wed, Dec 17, 2014 4:51 PM #


On 14-12-17 03:14 PM, Rolf Turner wrote:

I had a brief look at Schielzeth & Nakagawa 2012; they don't give
any immediately useful expression for the 'distribution-specific
variance' term sigma^2_d, and going back to Nakagawa and Schielzeth
2010 ("Repeatability for Gaussian and non-Gaussian data ...", cited by
SN2012 for distribution-specific variances) points further into the
weeds, as they say "There are other options, like negative binomial
models, that could also be considered (but are not treated here)" and
refer the reader to papers by Carrasco 2009 and Carrasco and Jover
2005 ...  (Presumably the problem here is that the overdispersion in
the standard NB2 parameterization is neither additive nor
multiplicative.)  This probably wouldn't be too hard to work out in a
few hours of thought, but ...

  If you just need to make reviewers/colleagues happy, you could
always use the squared correlation coefficient between fitted and
observed values (I think Doug Bates has suggested this in the past).
This is certainly "an" R^2 measure, if not "the" R^2 measure.
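
A minimal sketch of that squared-correlation measure, assuming fitted values on the response (count) scale are available. The helper `r2_cor` is a hypothetical name, not part of any package, and the data are simulated stand-ins. To keep the example self-contained it uses a plain negative binomial GLM from `MASS::glm.nb` rather than a glmmADMB fit.

```r
library(MASS)  # glm.nb(); MASS is a recommended package shipped with R

# Hypothetical helper (not from any package): squared Pearson correlation
# between observed and fitted values, both on the response (count) scale.
r2_cor <- function(observed, fitted) cor(observed, fitted)^2

# Simulated stand-in data: negative binomial counts with a log link.
set.seed(42)
n <- 500
d <- data.frame(x = runif(n, -1, 1))
d$y <- rnbinom(n, mu = exp(0.5 + 1.2 * d$x), size = 2)  # size plays the role of theta

fit <- glm.nb(y ~ x, data = d)
r2 <- r2_cor(d$y, fitted(fit))  # fitted() returns response-scale means for this fit
print(r2)
```

For a mixed model the result depends on which fitted values are used: predictions that include the estimated random effects will generally correlate more strongly with the observations than predictions from the fixed effects alone, so it is worth stating which was reported.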



Shinichi Nakagawa

Thu, Dec 18, 2014 1:32 AM #

Dear Jens

Our proposed R2 is not 'the' R2 but is also an R2 for mixed models that has several of the useful properties of the traditional R2; it was actually first proposed by Snijders & Bosker (1994).

Let's say NB(lambda, theta) with the log link: the mean = lambda, and the variance = lambda + lambda^2/theta.

The level 1 variance (on the link scale) should be ln(1 + 1/lambda + 1/theta); see the Appendix of our paper, Nakagawa & Schielzeth (2013).

For lambda, it is good to use mean(Y) (Y is the response, i.e. the counts), and the package should give you the value of theta (one should also use mean(Y) for Poisson models).

Here the level 1 variance is sigma^2_1 = sigma^2_e (additive over-dispersion) + sigma^2_d (distribution-specific) = sigma^2_epsilon (residual variance), as in our paper (2013).

But Holger and I are running a simulation study to check this before it is used, and we think we can extend the proposed R2 to other distributions, although we need to test a few things first (we should be ready in a month or so).
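
The level 1 variance above can be plugged into the marginal and conditional R2 of Nakagawa & Schielzeth (2013) roughly as follows. This is a sketch only: the function names `nb_level1_var` and `r2_nb` are hypothetical, the input numbers are made up, and the negative binomial formula itself is, as stated above, still to be checked by simulation.

```r
# Level 1 (observation-level) variance on the log-link scale for
# NB(lambda, theta), using the expression given in this message.
nb_level1_var <- function(lambda, theta) log(1 + 1 / lambda + 1 / theta)

# Marginal and conditional R2 in the style of Nakagawa & Schielzeth (2013).
# var_fixed:  variance of the fixed-effect linear predictor (link scale)
# var_random: sum of the random-effect variances
r2_nb <- function(var_fixed, var_random, lambda, theta) {
  var_resid <- nb_level1_var(lambda, theta)
  total <- var_fixed + var_random + var_resid
  c(marginal    = var_fixed / total,
    conditional = (var_fixed + var_random) / total)
}

# Made-up inputs purely for illustration; in practice lambda = mean(Y),
# and theta, var_fixed and var_random come from the fitted model.
res <- r2_nb(var_fixed = 0.8, var_random = 0.3, lambda = 4, theta = 2)
print(round(res, 3))
```

As theta grows large the 1/theta term vanishes and the expression reduces to ln(1 + 1/lambda), the Poisson case.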

Best wishes,

Shinichi

Shinichi Nakagawa, PhD
(Associate Professor of Behavioural Ecology)
Department of Zoology
University of Otago
340 Great King Street
P. O. Box 56
Dunedin, New Zealand
Tel:  +64-3-479-5046
Fax: +64-3-479-7584
http://sparrow.otago.ac.nz/


Simon Blomberg

Thu, Dec 18, 2014 1:48 PM #

I agree with Doug. R2 for anything other than an ordinary linear model is rearranging deck chairs on the Titanic. GLMs and GLMMs are complicated. They can be wrong in a variety of ways, and relying on a single number like R2 (however defined) is a poor way to assess the relative fit of a model. Pseudo R2s don't answer the same question as R2 for an OLS model anyway, as Doug pointed out. My approach would be to use posterior predictive tests in a Bayesian context, or perhaps cross-validation.
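
A rough, non-Bayesian analogue of such a predictive check can be sketched by simulating replicate data sets from the fitted model and comparing a summary statistic with its observed value. Everything here is an assumption made for illustration: the data are simulated, the model is a plain negative binomial GLM from `MASS::glm.nb` rather than a GLMM, the chosen statistic (proportion of zeros) is just one option, and the plug-in simulation ignores parameter uncertainty, which a true posterior predictive test would include.

```r
library(MASS)  # glm.nb(); MASS is a recommended package shipped with R

# Simulated stand-in data: negative binomial counts with a log link.
set.seed(101)
n <- 400
d <- data.frame(x = runif(n, -1, 1))
d$y <- rnbinom(n, mu = exp(0.3 + 1.0 * d$x), size = 1.5)

fit <- glm.nb(y ~ x, data = d)
mu_hat <- fitted(fit)      # fitted means on the count scale
theta_hat <- fit$theta     # estimated NB dispersion parameter

# Summary statistic to compare: proportion of zero counts.
prop_zero <- function(y) mean(y == 0)

# Simulate replicate responses from the fitted model using plug-in
# estimates (parameter uncertainty is ignored in this sketch).
n_sim <- 1000
sim_stat <- replicate(n_sim,
                      prop_zero(rnbinom(n, mu = mu_hat, size = theta_hat)))

obs_stat <- prop_zero(d$y)

# Two-sided tail proportion: how extreme is the observed statistic
# relative to the simulated ones?
p_tail <- min(1, 2 * min(mean(sim_stat >= obs_stat),
                         mean(sim_stat <= obs_stat)))
print(c(observed = obs_stat, simulated_mean = mean(sim_stat), p = p_tail))
```

An observed statistic far out in the tail of the simulated values points to a specific way in which the model fails (here, too few or too many zeros), which is more informative than a single R2-like number.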

Cheers,

Simon.


Ben Bolker

Thu, Dec 18, 2014 6:05 PM #


On 14-12-18 04:48 PM, Simon Blomberg wrote:

I agree with this position, *but* I will say that if this is going
to be the case then we (the expert-y people) need to provide more
worked examples of how to do this.  There's at least one example of a
posterior predictive simulation at
http://www.rpubs.com/bbolker/glmmchapter ...


Jens Oldeland

Fri, Dec 19, 2014 3:19 AM #

Thanks to everyone who has replied so far, and in particular thanks to Ben 
Bolker for the posterior predictive test example on RPubs! I have not 
worked with Bayesian stats so far, but I see that it is necessary, not only 
in the case of complex models but mostly there. Just yesterday, I dug 
through the first 4 chapters of the Introduction to WinBUGS... :)

Simple questions often provoke the most interesting discussions, don't 
they? :)
I learned something.

Thanks again and merry Xmas to all of you,

Jens


Quoting Douglas Bates <bates at stat.wisc.edu>:

<sermon>
I must admit to getting a little twitchy when people speak of the "R2 for
GLMMs".  R2 for a linear model is well-defined and has many desirable
properties.  For other models one can define different quantities that
reflect some but not all of these properties.  But this is not calculating
an R2 in the sense of obtaining a number having all the properties that the
R2 for linear models does.  Usually there are several different ways that
such a quantity could be defined.  Especially for GLMs and GLMMs before you
can define "proportion of response variance explained" you first need to
define what you mean by "response variance".  The whole point of GLMs and
GLMMs is that a simple sum of squares of deviations does not meaningfully
reflect the variability in the response because the variance of an
individual response depends on its mean.
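
The point that the variance of an individual response depends on its mean can be made concrete with a minimal Python sketch. It assumes the common NB2 parameterisation of the negative binomial, Var(y) = mu + mu^2 / k; the dispersion k = 2 and the means are hypothetical values chosen for illustration, and `nb_variance` is an illustrative helper, not a function from any package.

```python
def nb_variance(mu, k):
    """Variance of a negative binomial (NB2) response with mean mu and dispersion k."""
    return mu + mu ** 2 / k

k = 2.0  # hypothetical dispersion parameter
for mu in (1.0, 10.0, 100.0):
    var = nb_variance(mu, k)
    # The same raw deviation of 5 counts, scaled by the standard deviation
    # implied by the mean (a Pearson-type residual)
    pearson = 5.0 / var ** 0.5
    print(f"mu={mu:6.1f}  variance={var:8.1f}  scaled deviation of 5 counts: {pearson:.2f}")
```

A deviation of 5 counts is large when the mean is 1 and negligible when the mean is 100, so an unweighted sum of squared deviations mixes quantities that are not on a comparable scale.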

Confusion about what constitutes R2 or degrees of freedom or any of the
other quantities associated with linear models as applied to other models
comes from confusing the formula with the concept.  Although formulas are
derived from models the derivation often involves quite sophisticated
mathematics.  To avoid a potentially confusing derivation and just "cut to
the chase" it is easier to present the formulas.  But the formula is not
the concept.  Generalizing a formula is not equivalent to generalizing the
concept.  And those formulas are almost never used in practice, especially
for generalized linear models, analysis of variance and random effects.  I
have a "meta-theorem" that the only quantity actually calculated according
to the formulas given in introductory texts is the sample mean.

It may seem that I am being a grumpy old man about this, and perhaps I am,
but the danger is that people expect an "R2-like" quantity to have all the
properties of an R2 for linear models.  It can't.  There is no way to
generalize all the properties to a much more complicated model like a GLMM.

I was once on the committee reviewing a thesis proposal for Ph.D.
candidacy.  The proposal was to examine I think 9 different formulas that
could be considered ways of computing an R2 for a nonlinear regression
model to decide which one was "best".  Of course, this would be done
through a simulation study with only a couple of different models and only
a few different sets of parameter values for each. My suggestion that this
was an entirely meaningless exercise was not greeted warmly.
</sermon>

On Wed Dec 17 2014 at 9:49:28 AM Jens Oldeland <fbda005 at uni-hamburg.de>
wrote:

Dear List-members,

recently, the R2 calculations for GLMMs proposed by Nakagawa and
Schielzeth (2013) [1] were implemented in the MuMIn package. This is
incredibly good news, as many colleagues still require R2 to understand
a model output. I invested 2 weeks in lengthy calculations of about 20
negative binomial GLMMs using the glmmADMB package. Now, my colleagues
want the R2 (me too); however, sadly, the MuMIn functions only work
for binomial and Poisson GLMMs. Further, it seems that the functions do
not recognize glmmADMB output but expect (g)lmer output.

Now my question: Does any of you know whether this is "easy" to implement
and if so "how"? I tried to redo the code provided here (actually posing
the same question) but failed...:
http://stats.stackexchange.com/questions/109215/r%C2%B2-squared-from-a-generalized-linear-mixed-effects-models-glmm-using-a-negat

Or does anybody know if in the near future (this year?) it will be
implemented somewhere?

Is it possible to transform a GLMMADMB object into an lmer object?

Any hints are most welcome,

merry Xmas
Jens


[1] Nakagawa, S., & Schielzeth, H. (2013). A general and simple method
for obtaining R2 from generalized linear mixed-effects models. Methods
in Ecology and Evolution, 4(2), 133-142.


_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
