A question about multicollinear fixed/random factors
Dear Prof. Veríssimo,

Thanks for your prompt reply. I will proceed to analyze the data with both days (as fixed) and the meeting (as random) included. Yes, I understand that the inclusion of session as another random factor and of by-word random slopes should be considered seriously.

Thanks again,
- Kenjiro Matsuda
On 2022/08/20 19:15, João Veríssimo wrote:
If I'm understanding this right, I don't think there is a problem at all. Mixed-effects models can accommodate fixed effects at different levels, and these can coexist with the random effects without any issues. Number of days is simply a 'Level-2' predictor, with a unique value for each meeting.

(For more complex random-effects structures, you might want to consider the inclusion of session as another random factor and/or of by-word random slopes.)

João

On 20/08/2022 09:40, N o s t a l g i a wrote:
Hi,

I am looking at a character variation in Japanese parliamentary minutes, where the same character appears in two forms. In the parliament, there are a number of different committee meetings within the same session, and I am looking at 31 sessions over 10 years.

The factors I am considering are: the upper/lower house distinction, meetings (meetings within each session, which differ from session to session), the number of days between 1949/5/20 (when the first parliament was held) and the meeting, and the word within which the character appears. Of these, meetings and words are random factors, and they have hundreds of levels. The total number of cases is over one million.

The model I am considering is:

    glmer(character ~ ul + days + (1|word) + (1|meeting), data = glmmdata.1, family = binomial)

And here is my question: since a given meeting is unique not only within its session but in the whole data set, there would be a multicollinear relationship between days and meeting, in that specifying a meeting necessarily fixes a specific value of days. Is it a problem in a GLMM to have such a pair of fixed and random factors? If so, are there any ways to avoid the problem?

Thanks in advance,

Kenjiro Matsuda
Professor in Linguistics
Kobe Shoin Women's University
Kobe, Japan
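The structure discussed in this thread (days taking a single value within each meeting, i.e. a 'Level-2' predictor alongside a by-meeting random intercept) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data frame `toy` and all its values are made up, standing in for `glmmdata.1`; treating `ul` as constant within a meeting and standardizing days into `days_z` are choices made here for illustration, not something stated in the posts. The model fit runs only if lme4 is installed.

```r
# Toy stand-in for glmmdata.1: variable names follow the post, values are simulated.
set.seed(1)
n_meetings <- 40
n_words    <- 30

# One row per meeting: each meeting has exactly one value of days.
# Assumption for illustration: each meeting belongs to a single house (ul).
meetings <- data.frame(
  meeting = factor(seq_len(n_meetings)),
  days    = sort(sample(1:3650, n_meetings)),
  ul      = factor(sample(c("upper", "lower"), n_meetings, replace = TRUE))
)

# Observation-level data: many tokens per meeting, each inside some word.
toy <- meetings[sample(n_meetings, 4000, replace = TRUE), ]
toy$word <- factor(sample(paste0("w", seq_len(n_words)), nrow(toy), replace = TRUE))

# Check that days is a meeting-level predictor:
# it should take exactly one distinct value within each meeting.
days_per_meeting <- tapply(toy$days, toy$meeting, function(x) length(unique(x)))
days_is_level2   <- all(days_per_meeting == 1, na.rm = TRUE)
print(days_is_level2)

# Simulate a binary outcome with a days effect plus meeting and word intercepts.
toy$days_z  <- as.numeric(scale(toy$days))   # standardized days (a choice made here)
meeting_eff <- rnorm(n_meetings, sd = 0.5)
word_eff    <- rnorm(n_words, sd = 0.5)
eta <- -0.3 + 0.8 * toy$days_z +
  meeting_eff[as.integer(toy$meeting)] +
  word_eff[as.integer(toy$word)]
toy$character <- rbinom(nrow(toy), size = 1, prob = plogis(eta))

# Fit the same kind of model as in the post, if lme4 is available.
if (requireNamespace("lme4", quietly = TRUE)) {
  fit <- lme4::glmer(character ~ ul + days_z + (1 | word) + (1 | meeting),
                     data = toy, family = binomial)
  print(lme4::fixef(fit))
}
```

In this sketch the fixed effect of days describes the trend across meetings, while the by-meeting random intercept captures whatever meeting-to-meeting variation remains around that trend, which is why the two can sit in the same model.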
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models