Most principled reporting of mixed-effect model regression coefficients
Thanks, Maarten. So I was planning on reporting R^2 (along with AIC) for the overall model fit, not for each predictor, since the regression coefficients themselves give a good indication of the relationship (though I wasn't aware that R^2 is "riddled with complications"). Is Henrik saying this only with regard to LMMs and GLMMs?
That makes sense to me. For the overall model fit I would probably still go with Johnson's version [1], which I describe in my StackExchange post (and I think you mentioned it, or the Nakagawa and Schielzeth version it is based on, earlier), and report both the marginal and conditional R^2 values. The regression coefficients provide unstandardized effect sizes on the response scale, which I think are a valid way to report effect sizes (see below). I think Henrik refers to (G)LMMs and gives Rights & Sterba (2019) [2] as a reference. Also, the GLMM FAQ website provides a good overview [3].
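As a rough illustration of how the marginal and conditional R^2 differ, here is a minimal Python sketch. It assumes the simplest case, a Gaussian LMM with random intercepts only, where both values follow directly from the variance components. The function name and the variance values are hypothetical and chosen for illustration. Models with random slopes (the case Johnson's version [1] addresses) need a different calculation of the random-effect variance, which this sketch does not cover.

```python
def r2_marginal_conditional(var_fixed, var_random, var_resid):
    """Return (marginal, conditional) R^2 from variance components.

    var_fixed  : variance of the fixed-effect predictions
    var_random : summed variance of the random intercepts
    var_resid  : residual variance
    Assumes a Gaussian mixed model with random intercepts only.
    """
    total = var_fixed + var_random + var_resid
    if total <= 0:
        raise ValueError("total variance must be positive")
    # Marginal: variance explained by the fixed effects alone
    r2_marginal = var_fixed / total
    # Conditional: variance explained by fixed and random effects together
    r2_conditional = (var_fixed + var_random) / total
    return r2_marginal, r2_conditional


# Hypothetical variance components, for illustration only
r2m, r2c = r2_marginal_conditional(var_fixed=2.0, var_random=1.0, var_resid=1.0)
print(f"marginal R^2 = {r2m:.2f}, conditional R^2 = {r2c:.2f}")
```

The gap between the two values shows how much of the explained variance is attributable to the grouping structure rather than to the predictors, which is one reason to report both.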
When you say "there is no agreed upon way to calculate effect sizes" I'm a little confused. I read through your StackExchange post, but Henrik's answer refers to standardized effect sizes. You write, further down, "Whenever possible, we report unstandardized effect sizes which is in line with general recommendation of how to report effect sizes".
What you cite is still Henrik's opinion (and I hoped that I could make this clear by writing "This is what he suggests [...]" and by using the <blockquote> on StackExchange). And your citation still refers to LMMs, as he says: "Unfortunately, due to the way that variance is partitioned in linear mixed models (e.g., Rights & Sterba, 2019), there does not exist an agreed upon way to calculate standard effect sizes for individual model terms such as main effects or interactions." In general, I agree with him and with his recommendation to report unstandardized effect sizes (e.g. regression coefficients) if they have a "meaningful" interpretation. The semi-partial R^2 I mentioned in my last e-mail is an additional/alternative indicator of effect size that is probably more in line with what psychologists are used to seeing reported in papers (especially when results of factorial designs are reported), and that's the reason I mentioned it.
I'm also working on a systematic review where there's disagreement over whether effect sizes should be standardized, but it does seem that, to yield any kind of meaningful comparison, effect sizes would have to be standardized. I don't usually report standardized effect sizes... however, there are times when I z-score IVs to put them on the same scale, and I guess the output of that would be a standardized effect size. I wasn't aware of pushback on that practice. What issues would arise from this?
There is nothing wrong with standardizing predictor variables (e.g. by dividing by 1 or 2 standard deviations) to get measures of variable importance within the same model. Issues arise when standardized effect sizes such as R^2, partial eta^2, etc. are compared between different models without thinking about what the differences in these measures can be attributed to (see e.g. this question [4] or the Pek & Flora (2018) paper [5] that Henrik cites). Note that these are general issues that apply to all regression models, not only mixed models.

[1] https://doi.org/10.1111/2041-210X.12225
[2] https://doi.org/10.1037/met0000184
[3] https://bbolker.github.io/mixedmodels-misc/glmmFAQ.html#how-do-i-compute-a-coefficient-of-determination-r2-or-an-analogue-for-glmms
[4] https://stats.stackexchange.com/questions/13314/is-r2-useful-or-dangerous/13317
[5] https://doi.org/10.1037/met0000126

Best,
Maarten
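
To make the point about dividing predictors by 1 or 2 standard deviations concrete, here is a minimal Python sketch. For simplicity it uses ordinary least squares with a single predictor rather than a mixed model, and the data and helper names are made up for illustration. It shows only that standardizing rescales the slope by a known factor and leaves the fitted relationship unchanged. The same rescaling logic is assumed to carry over to fixed-effect coefficients.

```python
from statistics import mean, stdev


def slope(x, y):
    """Least-squares slope of y on a single predictor x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def standardize(x, n_sd=1):
    """Center x and divide by n_sd sample standard deviations."""
    m, s = mean(x), stdev(x)
    return [(a - m) / (n_sd * s) for a in x]


# Made-up data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b_raw = slope(x, y)                  # change in y per unit of x
b_1sd = slope(standardize(x, 1), y)  # change in y per 1 SD of x
b_2sd = slope(standardize(x, 2), y)  # change in y per 2 SD of x

print(b_raw, b_1sd, b_2sd)
```

Here `b_1sd` equals `b_raw` times the standard deviation of `x`, and `b_2sd` is twice `b_1sd`. The standardized slope therefore depends on the spread of the predictor in the particular sample, which is one reason comparisons across models or data sets need care.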