R2 measure in mixed models?
How does one summarize the "strength" of a fixed effect in the mixed model setting? It is this question that led me to try to understand the various pseudo R-squares. I'm curious how others do this (for any definition of strength).
On Fri, Feb 26, 2010 at 10:27 AM, Nick Isaac <njbisaac at googlemail.com> wrote:
Thanks for these pertinent comments. I can't comment on the motivation for the original post.

I have always felt that a single dimensionless Rsq was fairly meaningless in the context of mixed models. Gelman & Pardoe's formula summarizes the fit at each level in the model separately. This has more intuitive appeal, especially since I tend to fit models containing fixed effects at the group level. The motivation then would be to write a sentence along the lines of 'gender explains 5% of the among-subject variance in orthodontic growth curves; age explains 80% of the within-subject variation'.

Incidentally, G&P also state that negative Rsqs might be expected (for their index): essentially it means that adding a fixed effect causes the variance of a random effect to increase.

Best wishes,
Nick

On 26 February 2010 14:30, Douglas Bates <bates at stat.wisc.edu> wrote:
On Fri, Feb 26, 2010 at 7:37 AM, Nick Isaac <njbisaac at googlemail.com> wrote:
Sorry to be joining this late.

I have written some code to implement Gelman & Pardoe's Rsq for an lmer object. It gives some believable results, but it's difficult to be confident because of the translation from Bayesian into frequentist paradigms.

If anyone is interested then I'd be really happy to discuss this off-list and share/develop the code.
Assuming that one wants to define an R^2 measure, I think an argument could be made for treating the penalized residual sum of squares from a linear mixed model in the same way that we consider the residual sum of squares from a linear model. Or one could use just the residual sum of squares without the penalty or the minimum residual sum of squares obtainable from a given set of terms, which corresponds to an infinite precision matrix. I don't know, really. It depends on what you are trying to characterize.

In other words, what's the purpose? What aspect of the R^2 for a linear model are you trying to generalize?

I'm sorry if I sound argumentative but discussions like this sometimes frustrate me. A linear mixed model does not behave exactly like a linear model without random effects so a measure that may be appropriate for the linear model does not necessarily generalize. I'm not saying that this is the case but if the request is "I don't care what the number means or if indeed it means anything at all, just give me a number I can report", that's not the style of statistics I practice.

I regard Bill Venables' wonderful unpublished paper "Exegeses on Linear Models" (just put the name in a search engine to find a copy - there is only one paper with "Exegeses" and "Linear Models" in the title) as required reading for statisticians. As Bill emphasizes in that paper, statistics is not just a collection of formulas (many of which are based on approximations). It's about models and comparing how well different models fit the observed data. If we start with a formula and only ask ourselves "How do we generalize this formula?" we're missing the point. We should start at the model.

In a linear model the R^2 statistic is a dimensionless comparison of the quality of the current model fit, as measured by the residual sum of squares, to the fit one would obtain from a trivial model. When the current model can be shown to contain a model with an intercept term only (and whose coefficient will be estimated by the mean response) then that model fit is the trivial model. Otherwise the trivial model is a prediction of zero for each response. We know that the trivial model will produce a greater residual sum of squares than the current model fit because the models are nested. The R^2 is the proportion of variability not accounted for by the trivial model but accounted for by the current model (my apologies to my grammar teachers for having juxtaposed prepositions).

The interesting point there is that when you think of the relationships between models you can determine how you handle the case of a model that does not have an intercept term. If you start from the formula instead you can end up calculating a negative R^2 because you compare models that are not nested. Such nonsensical results are often reported. (I think it was the Mathematica documentation that gave a careful explanation of why you get a negative R^2 instead of recognizing that the formula they were using did not apply in certain cases.)

It may be that there is a sensible measure of the quality of fit from a linear mixed model that generalizes the R^2 from a linear model. I don't see an obvious candidate but I will freely admit that I haven't thought much about the problem. I would ask others who are thinking about this to consider both the "what" and the "why". George Mallory's justification of "because it's there" for attempting to climb Everest is perhaps a good justification for such endeavors (Mallory may have questioned his rationale as he lay freezing to death on the mountain). I don't think it is a good justification for manipulating formulas.
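
The nested-model view of R^2 for an ordinary linear model, as described above, can be sketched in a few lines. This is only an illustration and was not part of the original exchange: the data and the function names (fit_through_origin, fit_with_intercept, r_squared) are invented for the example, and it covers only a plain linear model, not a mixed model. It compares the residual sum of squares of a fitted model with that of the trivial model it contains: the mean response when the model has an intercept, a prediction of zero otherwise.

```python
# Illustrative sketch only: hypothetical data and function names.
# R^2 is computed as 1 - RSS(current model) / RSS(trivial model),
# where the trivial model is chosen so that it is nested in the current one.

def fit_through_origin(x, y):
    """Least-squares slope for the no-intercept model y = b * x."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

def fit_with_intercept(x, y):
    """Least-squares intercept and slope for y = a + b * x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    return ybar - b * xbar, b

def rss(y, fitted):
    """Residual sum of squares."""
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

def r_squared(y, fitted, trivial_is_mean):
    """Compare the fit with the trivial model: mean response or zero."""
    baseline = sum(y) / len(y) if trivial_is_mean else 0.0
    return 1.0 - rss(y, fitted) / rss(y, [baseline] * len(y))

# Made-up responses that are nearly constant, so a line through the origin fits badly.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [10.2, 9.9, 10.1, 9.8, 10.0]

a, b = fit_with_intercept(x, y)
fitted_intercept = [a + b * xi for xi in x]
b0 = fit_through_origin(x, y)
fitted_origin = [b0 * xi for xi in x]

# Model with intercept: the mean-only model is nested in it.
r2_intercept = r_squared(y, fitted_intercept, trivial_is_mean=True)
# Model without intercept: the zero prediction is the nested trivial model.
r2_origin_nested = r_squared(y, fitted_origin, trivial_is_mean=False)
# Applying the usual formula anyway compares models that are not nested.
r2_origin_naive = r_squared(y, fitted_origin, trivial_is_mean=True)

print(r2_intercept, r2_origin_nested, r2_origin_naive)
```

With these made-up numbers the two nested comparisons stay between 0 and 1, while the non-nested comparison for the no-intercept model is strongly negative, which is the kind of nonsensical result mentioned above.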
Best wishes,
Nick

On 18 February 2010 06:55, Luisa Carvalheiro <lgcarvalheiro at gmail.com> wrote:
Hi Steve,

Thanks for the reply and literature list. Here are 3 papers on R2 calculations for mixed models:

M. Mittlbock, T. Waldhor. Adjustments for R2-measures for Poisson regression models. Computational Statistics & Data Analysis 34 (2000) 461-472

M. Mittlbock. Calculating adjusted R2 measures for Poisson regression models. Computer Methods and Programs in Biomedicine 68 (2002) 205-214

H. Liu, Y. Zheng and J. Shen. Goodness-of-fit measures of R2 for repeated measures mixed effect models. Journal of Applied Statistics 35 (2008) 1081-1092

On Thu, Feb 18, 2010 at 1:46 AM, Steven J. Pierce <pierces1 at msu.edu> wrote:
Luisa,

I'm not aware of any packages for that, but I'd like the full citation for the paper you mentioned. In exchange, here are some citations for other papers about R-square measures in multilevel models that I've found.

Edwards, L. J., Muller, K. E., Wolfinger, R. D., Qaqish, B. F., & Schabenberger, O. (2008). An R2 statistic for fixed effects in the linear mixed model. Statistics in Medicine, 27(29), 6137-6157. doi: 10.1002/sim.3429

Gelman, A., & Pardoe, I. (2006). Bayesian measures of explained variance and pooling in multilevel (hierarchical) models. Technometrics, 48(2), 241-251. doi: 10.1198/004017005000000517

Kramer, M. (2005). R2 statistics for mixed models. Proceedings of the Conference on Applied Statistics in Agriculture, 17, 148-160. Retrieved from SupplRsq.pdf

Merlo, J., Yang, M., Chaix, B., Lynch, J., & Råstam, L. (2005). A brief conceptual tutorial on multilevel analysis in social epidemiology: investigating contextual phenomena in different groups of people. Journal of Epidemiology and Community Health, 59(9), 729-736. doi: 10.1136/jech.2004.023929

Orelien, J. G., & Edwards, L. J. (2008). Fixed-effect variable selection in linear mixed models using R2 statistics. Computational Statistics & Data Analysis, 52(4), 1896-1907. doi: 10.1016/j.csda.2007.06.006

Roberts, J. K., & Monaco, J. P. (2006, April). Effect size measures for the two-level linear multilevel model. Paper presented at the annual meeting of the American Educational Research Association, San Francisco, CA. Retrieved from http://www.hlm-online.com/papers/HLM_effect_size.pdf

Snijders, T. A. B., & Bosker, R. J. (1994). Modeled variance in two-level models. Sociological Methods & Research, 22(3), 342-363. doi: 10.1177/0049124194022003004

Snijders, T. A. B., & Bosker, R. J. (1999). Multilevel analysis. London, UK: Sage.

Xu, R. (2003). Measuring explained variation in linear mixed effects models. Statistics in Medicine, 22(22), 3527-3541. doi: 10.1002/sim.1572

Steven J. Pierce
Associate Director
Center for Statistical Training & Consulting (CSTAT)
Michigan State University
178 Giltner Hall
East Lansing, MI 48824
Web: http://www.cstat.msu.edu

-----Original Message-----
From: Luisa Carvalheiro [mailto:lgcarvalheiro at gmail.com]
Sent: Wednesday, February 17, 2010 6:00 AM
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] R2 measure in mixed models?

Dear mixed modelers,

Is there any package for calculating R2 measures for mixed models in R (e.g. using the measure proposed by Mittlböck & Waldhör 2000)?

Luisa
-- Luisa Carvalheiro, PhD Southern African Biodiversity Institute, Kirstenbosch Research Center, Claremont & University of Pretoria Postal address - SAWC Pbag X3015 Hoedspruit 1380, South Africa telephone - +27 (0) 790250944 Carvalheiro at sanbi.org lgcarvalheiro at gmail.com
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models