
Most principled reporting of mixed-effect model regression coefficients

Hi James,
The conditional R2 is usually higher than the marginal R2; with random slopes it is probably much higher. You can compare your results across performance::r2_nakagawa(), piecewiseSEM::rsquared(), and MuMIn::r.squaredGLMM() (see also the next answer below).
In most cases, these packages yield similar or identical results. There are, however, certain situations where one or another package might better fit your needs. In my experience, these functions differ in the following points:

- r2_nakagawa() only uses the log-approximation, while rsquared() and r.squaredGLMM() also report results from the delta-approximation.
- all three functions yield consistent results for linear mixed models, including those with random slopes
- rsquared() fails for binomial models where the response is cbind(trials, success) or similar
- r2_nakagawa() and r.squaredGLMM() can handle models from the glmmTMB package
- all three functions yield different results for "glmer(count ~ spp + mined + (1 | site), family = poisson, data = Salamanders)"
- currently, only r2_nakagawa() supports glmmTMB models with zero-inflation
- r2_nakagawa() probably supports more packages (like GLMMadaptive or afex)
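To make the comparison above concrete, here is a minimal sketch that runs all three functions on the same random-slope model. The model and data (lme4's built-in sleepstudy) are stand-ins for illustration, not from the original thread:

```r
# Sketch: comparing the three R2 implementations on one random-slope model.
# Assumes lme4, performance, piecewiseSEM, and MuMIn are installed.
library(lme4)

m <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

performance::r2_nakagawa(m)   # marginal & conditional R2 (log-approximation for GLMMs)
piecewiseSEM::rsquared(m)     # also reports the delta-approximation for GLMMs
MuMIn::r.squaredGLMM(m)       # R2m / R2c; delta and log-normal variants for GLMMs
```

For a linear mixed model like this one, all three should agree closely; the delta vs. log-approximation distinction only matters for non-Gaussian families.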
I would personally recommend looking at different aspects, not only fit measures, but also which model makes more sense theoretically. Furthermore, it depends on what you're focusing on. If you're interested in the variance of your random effects, i.e. how much variation in your outcome depends on different groups/clusters/subjects, then comparing the conditional and marginal R2 makes more sense than looking at AIC (in that situation you could also look directly at the ICC, which points in a similar direction as the marginal and conditional R2). Also, AIC is most useful when comparing models; it tells you less about a given model by itself, whereas R2 is indeed a useful measure for a single model.
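The ICC mentioned above can be computed directly; a small sketch, again using lme4's sleepstudy data as an illustrative stand-in:

```r
# Sketch: the ICC as a complement to marginal vs. conditional R2.
# The adjusted ICC quantifies the share of (non-fixed-effect) variance
# attributable to the grouping structure -- conceptually close to the
# gap between conditional and marginal R2.
library(lme4)

m <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

performance::icc(m)
```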

Best
Daniel

-----Original Message-----
From: R-sig-mixed-models [mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of Ades, James
Sent: Wednesday, 26 February 2020 06:19
To: Maarten Jung <Maarten.Jung at mailbox.tu-dresden.de>
Cc: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] Most principled reporting of mixed-effect model regression coefficients

Thanks, Daniel and Maarten!

I looked at both the Nakagawa and Schielzeth paper and the Johnson paper; I also looked through your other references...thanks for those. I really liked the linked Stack Exchange post with whuber's lucid response on R^2.

Johnson references the MuMIn package, which I wasn't familiar with, though he writes that the function "r.squaredGLMM" takes the random slope into account (something that N & S mention as tedious and then wave aside). Using the N&S equation, for one of my models, I get an R^2 of .35, while using r.squaredGLMM, I get an R^2 of .43. I can't imagine that the random slope of time would make that big of a difference. (The conditional R^2 is .95, and I have no idea how it's that high.) Does anyone have any experience with the package?
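The comparison James describes can be sketched as follows. The model and data (lme4's sleepstudy) are illustrative stand-ins, not his actual executive-function data:

```r
# Sketch: how r.squaredGLMM() reflects a random slope. Comparing the
# random-slope model against an intercept-only counterpart shows where
# the marginal (R2m) and conditional (R2c) estimates diverge.
library(lme4)
library(MuMIn)

m_slope <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)
m_int   <- lmer(Reaction ~ Days + (1 | Subject),   data = sleepstudy)

r.squaredGLMM(m_slope)  # matrix with columns R2m (marginal) and R2c (conditional)
r.squaredGLMM(m_int)    # dropping the random slope mostly shifts R2c, not R2m
```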

While some models (not for model selection, but looking at PCA, individual variables, or some kind of aggregate measure for executive function) have comparatively large differences in AIC, using R^2 via MuMIn they might differ by only .01. In other words, what seemed to be decent (and significant by LRT) differences became inconsequential with r.squaredGLMM.

AIC seems to do a commendable job of yielding parsimony, but its utter lack of comparability (even with the same # of observations) is frustrating. While an AIC of 28,620 is better than one of 28,645, there is, to my knowledge, no real way of quantifying that difference. Alas, while whuber writes, "Most of the time you can find a better statistic than R^2. For model selection you can look to AIC and BIC," I think the issue is not only in selecting models (which AIC seems to do quite well), but again, in summarizing those models in intuitively quantitative ways.
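For what it's worth, one common device for putting an AIC difference on an interpretable scale is the relative likelihood exp(-dAIC/2), normalized into Akaike weights. A base-R sketch, using the 28,620 vs. 28,645 figures from the paragraph above purely as an illustration:

```r
# Sketch: quantifying a delta-AIC via Akaike weights.
# exp(-dAIC/2) is the relative likelihood of each model vs. the best one;
# dividing by the sum turns these into weights over the model set.
aic <- c(m1 = 28620, m2 = 28645)
d   <- aic - min(aic)    # delta-AIC: c(0, 25)
rel <- exp(-d / 2)       # relative likelihoods
w   <- rel / sum(rel)    # Akaike weights
round(w, 4)              # a 25-point gap puts essentially all weight on m1
```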

I've also looked into doing some kind of multiple time-series cross-validation, though from what I've read (see below), this is similarly fraught. Maybe leave-one-out is the best way to go. The structure of the data has four timepoints with executive function data. The first two timepoints ('17 school year) and the final two ('18 school year) correspond to each year's standardized test.
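One alternative to leave-one-out for this structure is blocking by school year, in the spirit of the temporal-blocking strategies in the Roberts et al. reference below. A sketch with entirely hypothetical data-frame and column names:

```r
# Sketch: a year-blocked train/test split matching the described structure
# (timepoints 1-2 = '17 school year, timepoints 3-4 = '18 school year).
# All names here are made up for illustration.
df <- data.frame(id        = rep(1:3, each = 4),
                 timepoint = rep(1:4, times = 3),
                 score     = rnorm(12))
df$year <- ifelse(df$timepoint <= 2, "2017", "2018")

train <- df[df$year == "2017", ]  # fit on the first year...
test  <- df[df$year == "2018", ]  # ...evaluate predictions on the second
```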

Thanks much!



Difficulty of selecting among multilevel models using predictive accuracy (Wei Wang & Andrew Gelman, Statistics and Its Interface, Volume 7, 2014)
http://www.stat.columbia.edu/~gelman/research/published/final_sub.pdf

On the use of cross-validation for time series predictor evaluation (Information Sciences: an International Journal)
https://dl.acm.org/doi/10.1016/j.ins.2011.12.028

Cross-validation strategies for data with temporal, spatial, hierarchical, or phylogenetic structure (Roberts et al., Ecography, 2017)
https://onlinelibrary.wiley.com/doi/full/10.1111/ecog.02881