How to fix a gamma model with poor fit?
A few quick comments: 1. The QQ-plot diagnostic needs to be against the appropriate distribution. For a gamma model, the theoretical quantiles comes from the gamma distribution, not the normal distribution. 2. You have a log link but your reaction-time data looks to be in seconds, not milliseconds. Note that log10(0.1) = -1 and log10(1) = 0, but log10(100) = 2 and log10(1000) = 3, so you'll get very different answers for seconds vs. milliseconds. The reason why log transforms are so nice for reaction times is not "just" the skew, but rather that there is an underlying power law driving the effects *on the milliseconds* scale. Are you using a gamma model because of the Lo and Andrews paper? I've indicated my skepticism about that work previously, but these are my critical points: - there's still a transformation going on, it's just in the link function - having a nonlinear link complicates interpretation of the coefficients - gamma models are much harder to fit numerically (and I believe that codepath is less well tested in lme4; it's a known problem in MixedModels.jl) - the usual tests on residuals, etc. now have to be done against a gamma distribution, not a normal, but a lot of diagnostics use the normal by default - I don't understand the obsession with "satisfying normality assumptions" (from their abstract) in a GLMM -- half the point of the *generalized* bit is that you can swap in a different distributional assumption (the other half is the use of a link function) When thinking about using a model from a particular family/distribution, note that the distributional assumption is *on the residuals* so the skew in your raw data may or may not be present in the residuals. So maybe you don't need a Gamma model at all. Looking at your model output, it looks like you're using sum contrasts -- great! But checkout contr.Sum from the car package for nicer looking labels. :) Phillip
On 13/7/21 6:18 pm, C?tia Ferreira De Oliveira via R-sig-mixed-models wrote:
Dear all, I am sorry for reposting here after posting on cross validated here <https://stats.stackexchange.com/questions/534098/glmer-gamma-model-good-fit> but I am still not sure what would be the best way of going about fixing this model. It seems to have poor fit if you look at the plots as they have extremes on both sides, which would not fit well with a gamma distribution. Despite this, the results are consistent across packages (lme4, nlme...). I have 209062 rows of data and this is response time data. I want to determine whether there are differences between groups (Groups - 2 levels) on the learning of a task (Probability - 2 levels) across time (within sessions - Block - 4 levels / across sessions - Session - 2 levels). It doesn't have zero response times, but some close to zero. Do you have any suggestions for how one can improve a model like this or whether I should just use another distribution that fits the data a bit better? Thank you! Catia Model: glmer(RT ~ Prob * Bl * Session * Gr + (1 | Participant), data= Data.trimmed, family = Gamma(link = "log"), control=glmerControl(optimizer="bobyqa")) Model summary: Generalized linear mixed model fit by maximum likelihood (Adaptive Gauss-Hermite Quadrature, nAGQ = 0) ['glmerMod'] Family: Gamma ( log ) Formula: RT ~ Probability * Block * Session * Group + (1 | Participant) Data: Data.trimmed Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 1e+06)) AIC BIC logLik deviance df.resid 2456107 2456538 -1228012 2456023 209020 Scaled residuals: Min 1Q Median 3Q Max -4.297 -0.625 -0.158 0.440 35.691 Random effects: Groups Name Variance Std.Dev. Participant (Intercept) 0.002203 0.04694 Residual 0.053481 0.23126 Number of obs: 209062, groups: Participant, 130 Fixed effects: Estimate Std. Error t value Pr(>|z|) (Intercept) 6.024e+00 4.182e-03 1440.439 < 2e-16 *** Probability1 -2.835e-02 7.041e-04 -40.265 < 2e-16 *** Block2-1 -2.925e-02 2.077e-03 -14.084 < 2e-16 *** Block3-2 -3.676e-03 2.131e-03 -1.725 0.084500 . Block4-3 4.085e-03 2.307e-03 1.771 0.076577 . Block5-4 -1.220e-02 2.380e-03 -5.125 2.98e-07 *** Session1 4.795e-02 7.323e-04 65.478 < 2e-16 *** Group1 -4.616e-02 4.182e-03 -11.037 < 2e-16 *** Probability1:Block2-1 -7.228e-03 2.077e-03 -3.480 0.000501 *** Probability1:Block3-2 -5.332e-03 2.131e-03 -2.503 0.012331 * Probability1:Block4-3 -2.076e-02 2.307e-03 -8.999 < 2e-16 *** Probability1:Block5-4 6.044e-03 2.380e-03 2.539 0.011104 * Probability1:Session1 1.656e-03 7.046e-04 2.351 0.018743 * Block2-1:Session1 -1.972e-02 2.077e-03 -9.494 < 2e-16 *** Block3-2:Session1 -8.521e-03 2.131e-03 -3.999 6.35e-05 *** Block4-3:Session1 4.380e-05 2.308e-03 0.019 0.984856 Block5-4:Session1 -3.768e-03 2.380e-03 -1.583 0.113389 Probability1:Group1 1.515e-03 7.041e-04 2.151 0.031478 * Block2-1:Group1 -6.161e-03 2.077e-03 -2.966 0.003015 ** Block3-2:Group1 -1.129e-02 2.131e-03 -5.301 1.15e-07 *** Block4-3:Group1 7.095e-03 2.307e-03 3.076 0.002101 ** Block5-4:Group1 -4.055e-03 2.380e-03 -1.704 0.088414 . Session1:Group1 3.782e-03 7.323e-04 5.164 2.41e-07 *** Probability1:Block2-1:Session1 5.729e-05 2.077e-03 0.028 0.977997 Probability1:Block3-2:Session1 3.543e-03 2.131e-03 1.663 0.096363 . Probability1:Block4-3:Session1 -6.877e-03 2.308e-03 -2.980 0.002886 ** Probability1:Block5-4:Session1 4.329e-03 2.380e-03 1.819 0.068952 . Probability1:Block2-1:Group1 -1.238e-03 2.077e-03 -0.596 0.550980 Probability1:Block3-2:Group1 1.022e-02 2.131e-03 4.795 1.63e-06 *** Probability1:Block4-3:Group1 -6.532e-03 2.307e-03 -2.831 0.004634 ** Probability1:Block5-4:Group1 2.351e-03 2.380e-03 0.988 0.323373 Probability1:Session1:Group1 -1.805e-03 7.046e-04 -2.562 0.010412 * Block2-1:Session1:Group1 -2.060e-04 2.077e-03 -0.099 0.920984 Block3-2:Session1:Group1 -4.211e-03 2.131e-03 -1.977 0.048094 * Block4-3:Session1:Group1 3.339e-03 2.308e-03 1.447 0.147888 Block5-4:Session1:Group1 -3.956e-03 2.380e-03 -1.662 0.096539 . Probability1:Block2-1:Session1:Group1 -1.270e-03 2.077e-03 -0.611 0.540933 Probability1:Block3-2:Session1:Group1 1.678e-03 2.131e-03 0.788 0.430929 Probability1:Block4-3:Session1:Group1 -4.640e-03 2.308e-03 -2.010 0.044392 * Probability1:Block5-4:Session1:Group1 4.714e-03 2.380e-03 1.980 0.047649 * --- Signif. codes: 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1 Correlation matrix not shown by default, as p = 40 > 12. Use print(x, correlation=TRUE) or vcov(x) if you need it Plots: [1]: https://i.stack.imgur.com/XPdtl.png [2]: https://i.stack.imgur.com/zUNRX.png [3]: https://i.stack.imgur.com/6slYG.png [4]: https://i.stack.imgur.com/LlRwT.png [5]: https://i.stack.imgur.com/TNYCP.png [6]: https://i.stack.imgur.com/45l0P.png [[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models