Hi there,
I tried to follow the suggestions in Bates et al.'s parsimonious mixed model
paper (link <https://arxiv.org/abs/1506.04967> ) and I'm having some
problems. Basically, I started with a maximal model (m0) and then a
zero-correlation-parameter model (m1), then I reduced the model by taking
out near-zero variance components (m2). Once I extended the reduced model
with correlation parameters (m3), the model becomes degenerate (some
correlation parameters are 1 or -1). However, the likelihood ratio test
(anova()) suggests that m3 provides a significantly better fit than m2. How
should I proceed? What would be the optimal model in cases like this?
I'm not sure if I did something wrong in the process, so I'll detail what I
did below.
Thanks!
Wing-Yee Chow
==================================
My experiment has a 2x2 within-participant design with 24 participants and
48 items. Both factors (Congruity and SentenceType) are categorical and I
specified the contrasts this way:
cCongruity||subj) + (cSenttype * cCongruity||item), REML=FALSE,
data=tempdata))
Linear mixed model fit by maximum likelihood ['lmerMod']
Formula: value ~ sentencetype * congruity + ((1 | subj) + (0 + cSenttype |
subj) + (0 + cCongruity | subj) + (0 + cSenttype:cCongruity |
subj)) + ((1 | item) + (0 + cSenttype | item) + (0 + cCongruity |
item) + (0 + cSenttype:cCongruity | item))
Data: tempdata
AIC BIC logLik deviance df.resid
14419.9 14484.5 -7196.9 14393.9 1057
Scaled residuals:
Min 1Q Median 3Q Max
-2.4709 -0.6110 -0.1823 0.4011 6.1156
Random effects:
Groups Name Variance Std.Dev.
item cSenttype:cCongruity 0.000e+00 0.000e+00
item.1 cCongruity 0.000e+00 0.000e+00
item.2 cSenttype 2.478e+03 4.978e+01
item.3 (Intercept) 5.268e+03 7.258e+01
subj cSenttype:cCongruity 0.000e+00 0.000e+00
subj.1 cCongruity 3.473e+01 5.893e+00
subj.2 cSenttype 3.819e-11 6.180e-06
subj.3 (Intercept) 1.079e+04 1.039e+02
Residual 3.543e+04 1.882e+02
Number of obs: 1070, groups: item, 48; subj, 24
Fixed effects:
Estimate Std. Error t value
(Intercept) 388.41 24.34 15.957
sentencetype1 57.23 13.59 4.210
congruity1 -17.72 11.60 -1.527
sentencetype1:congruity1 -20.81 23.07 -0.902
Correlation of Fixed Effects:
(Intr) sntnc1 cngrt1
sentenctyp1 -0.002
congruity1 0.000 -0.001
sntnctyp1:1 0.000 -0.002 -0.009
Once again rePCA() suggests that the model is over-parameterised:
summary(rePCA(m1))
$item
Importance of components:
[,1] [,2] [,3] [,4]
Standard deviation 0.3856 0.2645 0 0
Proportion of Variance 0.6801 0.3200 0 0
Cumulative Proportion 0.6801 1.0000 1 1
$subj
Importance of components:
[,1] [,2] [,3] [,4]
Standard deviation 0.5518 0.03131 3.283e-08 0
Proportion of Variance 0.9968 0.00321 0.000e+00 0
Cumulative Proportion 0.9968 1.00000 1.000e+00 1
So then I reduced the model by taking out 4 variance components that are
zero/near-zero, and rePCA suggests that this reduced model is no longer
over-parameterised:
(cSenttype||item),REML=FALSE, data=tempdata))
Linear mixed model fit by maximum likelihood ['lmerMod']
Formula: value ~ sentencetype * congruity + ((1 | subj) + (0 + cCongruity |
subj)) + ((1 | item) + (0 + cSenttype | item))
Data: tempdata
AIC BIC logLik deviance df.resid
14411.9 14456.6 -7196.9 14393.9 1061
Scaled residuals:
Min 1Q Median 3Q Max
-2.4709 -0.6110 -0.1823 0.4011 6.1156
Random effects:
Groups Name Variance Std.Dev.
item cSenttype 2478.42 49.784
item.1 (Intercept) 5267.94 72.581
subj cCongruity 34.73 5.894
subj.1 (Intercept) 10785.48 103.853
Residual 35428.11 188.224
Number of obs: 1070, groups: item, 48; subj, 24
Fixed effects:
Estimate Std. Error t value
(Intercept) 388.41 24.34 15.957
sentencetype1 57.23 13.59 4.210
congruity1 -17.72 11.60 -1.527
sentencetype1:congruity1 -20.81 23.07 -0.902
Correlation of Fixed Effects:
(Intr) sntnc1 cngrt1
sentenctyp1 -0.002
congruity1 0.000 -0.001
sntnctyp1:1 0.000 -0.002 -0.009
summary(rePCA(m2))
$item
Importance of components:
[,1] [,2]
Standard deviation 0.3856 0.2645
Proportion of Variance 0.6801 0.3200
Cumulative Proportion 0.6801 1.0000
$subj
Importance of components:
[,1] [,2]
Standard deviation 0.5518 0.03131
Proportion of Variance 0.9968 0.00321
Cumulative Proportion 0.9968 1.00000
Then I followed Bates et al's guidelines in extending this reduced model
with correlation parameters:
(cSenttype|item),REML=FALSE, data=tempdata))
Linear mixed model fit by maximum likelihood ['lmerMod']
Formula: value ~ sentencetype * congruity + (cCongruity | subj) + (cSenttype
| item)
Data: tempdata
AIC BIC logLik deviance df.resid
14395.2 14450.0 -7186.6 14373.2 1059
Scaled residuals:
Min 1Q Median 3Q Max
-2.6509 -0.6383 -0.1679 0.4151 5.9618
Random effects:
Groups Name Variance Std.Dev. Corr
item (Intercept) 5391.4 73.43
cSenttype 3085.8 55.55 1.00
subj (Intercept) 10803.3 103.94
cCongruity 826.1 28.74 -1.00
Residual 35022.7 187.14
Number of obs: 1070, groups: item, 48; subj, 24
Fixed effects:
Estimate Std. Error t value
(Intercept) 388.76 24.40 15.933
sentencetype1 56.88 13.99 4.066
congruity1 -18.10 12.88 -1.405
sentencetype1:congruity1 -20.94 22.93 -0.913
Correlation of Fixed Effects:
(Intr) sntnc1 cngrt1
sentenctyp1 0.247
congruity1 -0.396 -0.001
sntnctyp1:1 0.000 -0.002 -0.007
The correlation parameters of 1 and -1 suggest that this model is
degenerate. This is consistent with the results of rePCA():
summary(rePCA(m3))
$item
Importance of components:
[,1] [,2]
Standard deviation 0.492 0
Proportion of Variance 1.000 0
Cumulative Proportion 1.000 1
$subj
Importance of components:
[,1] [,2]
Standard deviation 0.5762 1.678e-07
Proportion of Variance 1.0000 0.000e+00
Cumulative Proportion 1.0000 1.000e+00
However, the likelihood ration test suggests that m3 provides a
significantly better fit than m2:
anova(m2, m3)
Data: tempdata
Models:
m2: value ~ sentencetype * congruity + ((1 | subj) + (0 + cCongruity |
m2: subj)) + ((1 | item) + (0 + cSenttype | item))
m3: value ~ sentencetype * congruity + (cCongruity | subj) + (cSenttype |
m3: item)
Df AIC BIC logLik deviance Chisq Chi Df Pr(>Chisq)
m2 9 14412 14457 -7196.9 14394
m3 11 14395 14450 -7186.6 14373 20.639 2 3.298e-05 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Questions: Should I have reduced the model further before extending it with
correlation parameters? How should I proceed (or what should I have done
differently)?