Random effects in clmm() of package ordinal
2 messages · Christian Brauner, Ben Bolker
On 14-08-29 07:31 AM, Christian Brauner wrote:
Hello,

when fitting linear mixed models, it is often suggested that testing for random effects is not a good idea, mainly because the null values of the random-effects parameters lie on the boundary of the parameter space. Hence, it is preferred not to test for random effects but rather to judge the inclusion of a random effect by the design of the experiment, or, if one really wants a test, to use computationally intensive methods such as parametric bootstraps. I have adopted this strategy of not testing for random effects with linear mixed models.

Now I'm in a situation where I need to analyse ordinal data in a repeated-measures design. The package I decided would best suit this purpose is the ordinal package (suggestions of alternatives are of course welcome), and this got me wondering about random effects again. I tested a random effect (in fact by accident, through a faulty automated regexp substitution) and got a p of 0.99. More precisely, I was testing for the significance of a random slope as opposed to including only a random intercept. As the boundary-of-parameter-space argument concerns maximum likelihood estimation in general, it also applies to the proportional odds cumulative mixed model.

But, and here is where I'm unsure what to do: in this particular case the inclusion of the random slope in the clmm turns a p of 0.004 into 0.1 for my main effect. In contrast, all other methods (e.g. treating my response not as an ordered factor but as a continuous variable and using a repeated-measures ANOVA) give me a p of 0.004. This is the only reason why I'm concerned. This difference worries me and I'm unsure what to do. Is it advisable to test for a random effect here?

Best,
Christian
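Concretely, the comparison was along these lines (a minimal sketch; the data frame and variable names -- d, response, treatment, subject -- are placeholders for my actual design):

```r
library(ordinal)

## Random intercept only (placeholder names throughout)
m0 <- clmm(response ~ treatment + (1 | subject), data = d)

## Additionally a random slope for treatment
m1 <- clmm(response ~ treatment + (1 + treatment | subject), data = d)

## Naive likelihood-ratio test of the random slope; because the null
## value of the slope variance lies on the boundary of the parameter
## space, this p-value is conservative.
anova(m0, m1)
```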
It sounds like something else is going on. In my experience the advice not to test random effects is based more on philosophy (the random effects are often a nuisance variable that is implicit in the experimental design, and is generally considered necessary for appropriate inference -- see e.g. Hurlbert 1984 _Ecology_ on "sacrificial pseudoreplication") than on the difficulties of inference for random effects (boundary effects, finite-size effects, etc.).

A large p-value means either that the point estimate of the RE variance is small, or that its confidence interval is very large (or both); especially in the former case, it is indeed surprising that its inclusion should change inference so much.

That's about as much as I think it's possible to say without more detail. I would suggest double-checking your data and model diagnostics (is there something funny about the data or the model fit?) and comparing point estimates and confidence intervals from the different fits to try to understand what the different models are saying about the data (not just why the p-value changes so much). Are you using different types of p-value estimation in different models (Wald vs. LRT vs. ...)? Are you inducing complete separation or severe imbalance by including the RE? Is one of your random-effect levels confounded with your main effect? (An example along these lines came up on the list a few months ago: https://stat.ethz.ch/pipermail/r-sig-mixed-models/2014q2/022188.html )

good luck
Ben Bolker
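For reference, a common rough correction for the boundary problem raised in the original question is to treat the likelihood-ratio statistic as a mixture of chi-square distributions rather than a single chi-square; for a single extra variance component this amounts to halving the naive p-value. A sketch, assuming m0 and m1 are nested clmm fits (hypothetical names) differing by one variance parameter:

```r
## m0, m1: nested model fits (placeholder names), m1 with one extra
## variance component; logLik() works on clmm fits from ordinal.
lrt <- as.numeric(2 * (logLik(m1) - logLik(m0)))

## Naive reference distribution: chisq(1)
p_naive <- pchisq(lrt, df = 1, lower.tail = FALSE)

## Boundary-corrected: under the null the statistic is approximately a
## 50:50 mixture of a point mass at 0 and chisq(1), so halve the p-value.
p_mixture <- 0.5 * p_naive
```

Note that adding a correlated random slope actually adds two parameters (the slope variance and the intercept-slope correlation), for which the appropriate reference distribution is an equal mixture of chisq(1) and chisq(2) rather than a simple halving; the parametric bootstrap mentioned in the question avoids these approximations entirely.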