Hello,
I have a fairly large, unbalanced dataset in which a continuous dependent variable Y was measured under 3 different conditions (C) for about 3000 subjects (ID), although not all subjects have Y values for all 3 conditions. Additionally, there is a continuous measure W that was measured for all subjects.
I am interested in testing the following:
- Is the effect of W significant overall?
- Is the effect of W significant at each level of C?
- Is the effect of C significant?
To try to answer these questions, I have specified the following model with lmer:
lmer( Y ~ W * C + (1 | ID), data = df)
This seems to properly reflect the structure of the data (I might be wrong here, so any suggestions would be welcome).
However, when looking at the diagnostic plots I noticed a slope in the residuals plot, and a slope different from y = x in the observed vs. fitted plot (as shown below). This made me question the validity of the model for inference.
Could I still use this model for inference? Should I specify a different formula? Should I turn to lme and try to allow a different residual variance for each level of condition (C)? Any ideas?
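To make the last question concrete, here is a minimal sketch of what I have in mind for the lme alternative, using varIdent from nlme to allow a separate residual variance per level of C. The data frame below is simulated toy data (it is not my real data), and the object names (w_by_id, id_eff, sds, m_lmer, m_lme0, m_lme) are only placeholders for illustration; I am not sure this is the right approach.

```r
library(lme4)   # lmer
library(nlme)   # lme, varIdent

# Toy data only, so that the sketch runs; NOT my real dataset
set.seed(1)
n  <- 200
df <- expand.grid(ID = factor(seq_len(n)), C = factor(c("c1", "c2", "c3")))
w_by_id <- rnorm(n)                       # one W value per subject
id_eff  <- rnorm(n)                       # subject-level random intercepts
sds     <- c(c1 = 0.5, c2 = 1, c3 = 2)    # assumed residual SD per condition
df$W <- w_by_id[as.integer(df$ID)]
df$Y <- 1 + 0.5 * df$W + id_eff[as.integer(df$ID)] +
  rnorm(nrow(df), sd = sds[as.character(df$C)])
df <- droplevels(df[-sample(nrow(df), 100), ])   # drop rows to mimic imbalance

# Current model: one common residual variance
m_lmer <- lmer(Y ~ W * C + (1 | ID), data = df)

# Same model in lme, without and with a separate residual variance per C
m_lme0 <- lme(Y ~ W * C, random = ~ 1 | ID, data = df, method = "REML")
m_lme  <- lme(Y ~ W * C, random = ~ 1 | ID,
              weights = varIdent(form = ~ 1 | C),
              data = df, method = "REML")

# Likelihood ratio test: does the heterogeneous-variance model fit better?
anova(m_lme0, m_lme)
```

If this is a sensible direction, I would compare the two lme fits as above and then check the residual plots of the heterogeneous-variance model again.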
I would really appreciate it if anyone could help me with this.
Thanks in advance,
Carlos Família