glmer conditional deviance DECREASING after removal of fixed effect??
I may have attributed the confusion about conditional and marginal deviances to you, Juho, when in fact it is the fault of the authors of lme4, of whom I am one. If so, I apologize. We (the authors of lme4) had a discussion about the definition of deviance many years ago and I am not sure what the upshot was. It may have been that we went with this incorrect definition of the "conditional deviance". Unfortunately my R skills are sufficiently atrophied that I haven't been able to look up what is actually evaluated.

The area of generalized linear models and generalized linear mixed models holds many traps for the unwary in terms of definitions, and some casual definitions in the original glm function, going as far back as the S3 language, add to the confusion. For example, the dev.resids function in a GLM family doesn't return the deviance residuals - it returns the squares of the deviance residuals, which should be named the "unit deviances".

Given a response vector, the predicted mean vector, and a distribution family, one can define the unit deviances, which play a role similar to that of the squared residuals in the Gaussian family. The sum of these unit deviances is the deviance of a generalized linear model, but it is not the deviance of a generalized linear mixed model, because the likelihood for a GLMM involves both a "fidelity to the data" term *and* a "complexity of the model" term - the size of the random-effects vector in a certain metric.

So if indeed we used an inappropriate definition of deviance, the mistake is ours and we should correct it.
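The unit-deviance point can be checked numerically. Below is a small illustrative sketch (in Python, since the thread's models were fit in R and Julia; the response and mean values are made up, not from the attached data) of the Bernoulli unit deviance, its signed square root (the deviance residual), and the fact that the sum of unit deviances equals -2 times the Bernoulli log-likelihood, i.e. the GLM deviance, because the saturated Bernoulli model contributes zero.

```python
import math

def unit_deviance(y, mu):
    """Bernoulli unit deviance: the *square* of the deviance residual.
    (This squared quantity is what the dev.resids component of an R
    GLM family returns, despite its name.)"""
    return -2.0 * (y * math.log(mu) + (1 - y) * math.log(1 - mu))

def deviance_residual(y, mu):
    # the deviance residual is the signed square root of the unit deviance
    return math.copysign(math.sqrt(unit_deviance(y, mu)), y - mu)

# toy response and fitted means (illustrative only)
y  = [1, 0, 1]
mu = [0.8, 0.3, 0.6]

glm_deviance = sum(unit_deviance(yi, mi) for yi, mi in zip(y, mu))
neg2_loglik  = -2.0 * sum(yi * math.log(mi) + (1 - yi) * math.log(1 - mi)
                          for yi, mi in zip(y, mu))
print(glm_deviance)   # ≈ 2.18, and equal to neg2_loglik up to rounding
```

For a GLMM the analogous sum is only the "fidelity" part of the objective; the complexity penalty described above must be added to obtain the quantity that is actually minimized.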
On Thu, Aug 31, 2023 at 7:56 AM Douglas Bates <dmbates at gmail.com> wrote:
You say you are comparing conditional deviances rather than marginal
deviances. Can you expand on that a bit further? How are you evaluating
the conditional deviances?
The models are being fit according to what you describe as the marginal
deviance - what I would call the deviance. It is not surprising that what
you are calling the conditional deviance is inconsistent with the nesting
of the models because they weren't fit according to that criterion.
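To make that concrete, here is a toy Python sketch (with made-up numbers, not fit to your data) of the penalized objective a GLMM minimizes: a fidelity term (the sum of unit deviances) plus a complexity penalty (the squared norm of a hypothetical spherical random-effects vector u). Because the optimizer trades one term against the other, the candidate with the lower total can have the *higher* fidelity-only ("conditional") deviance:

```python
import math

def bernoulli_unit_deviance(y, mu):
    # squared deviance residual for one Bernoulli observation
    return -2.0 * (y * math.log(mu) + (1 - y) * math.log(1 - mu))

def penalized_objective(ys, mus, u):
    """Laplace-style GLMM objective: fidelity plus complexity.
    u is a hypothetical spherical random-effects vector."""
    fidelity = sum(bernoulli_unit_deviance(y, m) for y, m in zip(ys, mus))
    penalty = sum(ui * ui for ui in u)   # ||u||^2, the simplest metric
    return fidelity, penalty, fidelity + penalty

ys = [1, 0, 1, 1]
# candidate A: conditional means close to the data, but a large random effect
fid_a, pen_a, tot_a = penalized_objective(ys, [0.9, 0.1, 0.9, 0.9], [1.5])
# candidate B: worse fidelity, smaller random effect, smaller *total*
fid_b, pen_b, tot_b = penalized_objective(ys, [0.7, 0.3, 0.7, 0.7], [0.3])
print(fid_a, tot_a)   # fidelity ≈ 0.84, total ≈ 3.09
print(fid_b, tot_b)   # fidelity ≈ 2.85, total ≈ 2.94
```

An optimizer minimizing the total prefers B even though B's fidelity term is larger, which is why a fidelity-only "conditional deviance" need not be monotone across nested fits.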
julia> m01 = let f = @formula y ~ 1 + x1 + x2 + x3 + x4 + (1|id)
fit(MixedModel, f, dat, Bernoulli(); contrasts, nAGQ=9)
end
Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 9)
y ~ 1 + x1 + x2 + x3 + x4 + (1 | id)
Distribution: Bernoulli{Float64}
Link: LogitLink()
logLik deviance AIC AICc BIC
-1450.8163 2900.8511 2915.6326 2915.6833 2955.5600
Variance components:
Column Variance Std.Dev.
id (Intercept) 0.270823 0.520406
Number of obs: 2217; levels of grouping factors: 331
Fixed-effects parameters:
─────────────────────────────────────────────────────
                   Coef.  Std. Error      z  Pr(>|z|)
─────────────────────────────────────────────────────
(Intercept)    0.0969256  0.140211    0.69     0.4894
x1: 1         -0.0449824  0.0553361  -0.81     0.4163
x2             0.0744891  0.0133743   5.57     <1e-07
x3            -0.548392   0.0914109  -6.00     <1e-08
x4: B          0.390359   0.0803063   4.86     <1e-05
x4: C          0.299932   0.0991249   3.03     0.0025
─────────────────────────────────────────────────────
julia> m02 = let f = @formula y ~ 1 + x1 + x2 + x3 + (1|id)
fit(MixedModel, f, dat, Bernoulli(); contrasts, nAGQ=9)
end
Generalized Linear Mixed Model fit by maximum likelihood (nAGQ = 9)
y ~ 1 + x1 + x2 + x3 + (1 | id)
Distribution: Bernoulli{Float64}
Link: LogitLink()
logLik deviance AIC AICc BIC
-1472.4551 2944.0654 2954.9102 2954.9373 2983.4297
Variance components:
Column Variance Std.Dev.
id (Intercept) 0.274331 0.523766
Number of obs: 2217; levels of grouping factors: 331
Fixed-effects parameters:
────────────────────────────────────────────────────
                  Coef.  Std. Error      z  Pr(>|z|)
────────────────────────────────────────────────────
(Intercept)  -0.448496   0.100915   -4.44    <1e-05
x1: 1        -0.0527684  0.0547746  -0.96    0.3354
x2            0.0694393  0.0131541   5.28    <1e-06
x3           -0.556903   0.0904798  -6.15    <1e-09
────────────────────────────────────────────────────
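As a quick arithmetic check on the two printed summaries (Python, only re-deriving numbers that appear above): the AIC column matches -2·logLik + 2·dof in both fits, while the printed deviance is *not* simply -2·logLik here - which is precisely the definitional question this thread is about.

```python
# figures copied from the two model summaries above
loglik_m01, aic_m01, dev_m01 = -1450.8163, 2915.6326, 2900.8511
loglik_m02, aic_m02, dev_m02 = -1472.4551, 2954.9102, 2944.0654
dof_m01, dof_m02 = 7, 5   # model dof as reported in the likelihood-ratio test

# AIC = -2*logLik + 2*dof reproduces the printed AIC column
aic_check_m01 = -2 * loglik_m01 + 2 * dof_m01   # 2915.6326
aic_check_m02 = -2 * loglik_m02 + 2 * dof_m02   # 2954.9102

# but the printed deviance differs from -2*logLik by a noticeable amount
gap_m01 = -2 * loglik_m01 - dev_m01   # ≈ 0.78
gap_m02 = -2 * loglik_m02 - dev_m02   # ≈ 0.84
```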
julia> MixedModels.likelihoodratiotest(m02, m01)
Model Formulae
1: y ~ 1 + x1 + x2 + x3 + (1 | id)
2: y ~ 1 + x1 + x2 + x3 + x4 + (1 | id)
──────────────────────────────────────────────────
     model-dof  deviance        χ²  χ²-dof  P(>χ²)
──────────────────────────────────────────────────
[1]  5          2944.0654
[2]  7          2900.8511  43.2144       2  <1e-09
──────────────────────────────────────────────────
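The printed p-value bound can be verified by hand: for a chi-squared variate with 2 degrees of freedom the survival function has the closed form P(X > x) = exp(-x/2), so a couple of lines of Python suffice (no distribution library needed).

```python
import math

chisq, dof = 43.2144, 2   # values from the likelihood-ratio test above
assert dof == 2           # the closed form below is specific to 2 df
p = math.exp(-chisq / 2)  # chi-squared(2) survival function
print(p)                  # ≈ 4.1e-10, consistent with the printed "<1e-09"
```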
I would note that your data are so imbalanced with respect to id that it
is not surprising that you get unstable results. (I changed your id column
from integers to a factor so that 33 becomes S033.)
331×2 DataFrame
 Row │ id      nrow
     │ String  Int64
─────┼───────────────
   1 │ S033     1227
   2 │ S134       46
   3 │ S295       45
   4 │ S127       41
   5 │ S125       33
   6 │ S228       31
   7 │ S193       23
   8 │ S064       18
   9 │ S281       16
  10 │ S055       13
  11 │ S035       13
  12 │ S091       12
  13 │ S175       11
  14 │ S284       10
  15 │ S159       10
  ⋮  │   ⋮        ⋮
 317 │ S324        1
 318 │ S115        1
 319 │ S192        1
 320 │ S201        1
 321 │ S156        1
 322 │ S202        1
 323 │ S067        1
 324 │ S264        1
 325 │ S023        1
 326 │ S090        1
 327 │ S195        1
 328 │ S170        1
 329 │ S241        1
 330 │ S189        1
 331 │ S213        1
       301 rows omitted
On Thu, Aug 31, 2023 at 3:02 AM Juho Kristian Ruohonen <
juho.kristian.ruohonen at gmail.com> wrote:
Hi, I thought it was impossible for deviance to decrease when a term is removed!?!? Yet I'm seeing it happen with this pair of relatively simple Bernoulli GLMMs fit using lme4::glmer():
full <- glmer(y ~ (1|id) + x1 + x2 + x3 + x4, family = binomial, nAGQ = 6, data = anon)
reduced <- update(full, ~. -x1)
c(full = deviance(full), reduced = deviance(reduced))
    full  reduced
2808.671 2807.374

What on earth is going on? FYI, I am deliberately comparing conditional deviances rather than marginal ones, because quite a few of the clusters are of inherent interest and likely to recur in future data. My anonymized datafile is attached.

Best, Juho
_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models