Hello,
I have a fairly large, unbalanced dataset in which a continuous dependent
variable (Y) was measured under 3 different conditions (C) for about 3000
subjects (ID), although not all subjects have Y values for all 3
conditions. Additionally, there is a continuous measure (W) that was
recorded for all subjects.
I am interested in testing the following:
- Is the effect of W significant overall?
- Is the effect of W significant at each level of C?
- Is the effect of C significant?
In order to try to answer this, I have specified the following model with
lmer:
lmer( Y ~ W * C + (1 | ID), data = df)
This seems to properly reflect the structure of the data (I might be wrong
here; any suggestions would be welcome).
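
For concreteness, here is a minimal sketch of how the three questions above might be tested with this model. It runs on simulated data: the data frame, the number of subjects, the effect sizes and the 20% missingness are all made up for illustration and are not the real dataset. The likelihood-ratio comparisons and the per-condition reparameterisation are one possible approach, not necessarily the recommended one.

```r
library(lme4)

set.seed(1)
n_id  <- 300                                  # hypothetical number of subjects
W     <- rnorm(n_id)                          # one W value per subject
u     <- rnorm(n_id, sd = 1)                  # subject-level random intercepts
slope <- c(0.2, 0.5, 0.8)                     # made-up W slopes for c1, c2, c3

df   <- expand.grid(ID = factor(seq_len(n_id)),
                    C  = factor(c("c1", "c2", "c3")))
df$W <- W[as.integer(df$ID)]
df$Y <- 1 + slope[as.integer(df$C)] * df$W +
        u[as.integer(df$ID)] + rnorm(nrow(df), sd = 0.5)

# Drop about 20% of the rows at random to mimic the unbalanced design
df <- droplevels(df[runif(nrow(df)) > 0.2, ])

# Full model, fitted by ML so that models with different fixed effects
# can be compared with likelihood-ratio tests
m_full <- lmer(Y ~ W * C + (1 | ID), data = df, REML = FALSE)

# Effect of W overall (main effect plus interaction with C)
m_noW <- lmer(Y ~ C + (1 | ID), data = df, REML = FALSE)
lrt_W <- anova(m_noW, m_full)

# Effect of C (main effect plus interaction with W)
m_noC <- lmer(Y ~ W + (1 | ID), data = df, REML = FALSE)
lrt_C <- anova(m_noC, m_full)

# Effect of W at each level of C: the same model, reparameterised so that
# each condition gets its own intercept and its own W slope
m_by_C <- lmer(Y ~ 0 + C + C:W + (1 | ID), data = df, REML = FALSE)
slopes <- coef(summary(m_by_C))[c("Cc1:W", "Cc2:W", "Cc3:W"), ]

print(lrt_W)
print(lrt_C)
print(slopes)
```

The last model has the same fit as the W * C model; only the coding of the fixed effects differs, so each "C...:W" row is the estimated W slope within one condition, with its Wald t value.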
However, when looking at the diagnostic plots I noticed a slope in the
residuals plot, and a slope different from y = x in the observed vs fitted
plot (as shown below). This made me question the validity of the model
for inference.
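
A minimal sketch of the kind of diagnostic plots described here, again on simulated stand-in data (all values below are invented for illustration). It plots residuals and observed values against the conditional fitted values, and also extracts the fixed-effects-only predictions, since the two kinds of fitted values can look quite different in a mixed model.

```r
library(lme4)

set.seed(2)
n_id <- 300                                   # hypothetical number of subjects
df   <- expand.grid(ID = factor(seq_len(n_id)),
                    C  = factor(c("c1", "c2", "c3")))
df$W <- rnorm(n_id)[as.integer(df$ID)]        # one W value per subject
df$Y <- 1 + 0.5 * df$W +
        rnorm(n_id)[as.integer(df$ID)] +      # subject random intercepts
        rnorm(nrow(df), sd = 0.5)             # residual noise

fit <- lmer(Y ~ W * C + (1 | ID), data = df)

fit_cond <- fitted(fit)                  # fixed effects + predicted random effects
fit_marg <- predict(fit, re.form = NA)   # fixed effects only (population level)
res      <- resid(fit)                   # conditional residuals

op <- par(mfrow = c(1, 2))
plot(fit_cond, res, xlab = "Fitted (conditional)", ylab = "Residual")
abline(h = 0, lty = 2)                   # reference line: no trend
plot(fit_cond, df$Y, xlab = "Fitted (conditional)", ylab = "Observed")
abline(0, 1, lty = 2)                    # reference line: y = x
par(op)

# Quantify the linear trend of the residuals against the fitted values
trend <- unname(coef(lm(res ~ fit_cond))[2])
print(trend)
```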
Could I still use this model for inference? Should I specify a different
formula? Should I turn to lme and allow a different residual variance for
each level of condition (C)? Any ideas?
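
To make the lme option concrete, here is a minimal sketch using nlme with a varIdent variance structure, which estimates a separate residual standard deviation for each level of C. The data are simulated with made-up residual SDs per condition, purely as an assumption for illustration.

```r
library(nlme)

set.seed(3)
n_id <- 300                                   # hypothetical number of subjects
sd_C <- c(0.5, 1.0, 1.5)                      # made-up residual SDs for c1, c2, c3
df   <- expand.grid(ID = factor(seq_len(n_id)),
                    C  = factor(c("c1", "c2", "c3")))
df$W <- rnorm(n_id)[as.integer(df$ID)]        # one W value per subject
df$Y <- 1 + 0.5 * df$W +
        rnorm(n_id)[as.integer(df$ID)] +      # subject random intercepts
        rnorm(nrow(df), sd = sd_C[as.integer(df$C)])

# Same fixed and random structure as the lmer model, constant residual variance
m_hom <- lme(Y ~ W * C, random = ~ 1 | ID, data = df)

# Same model, but with a separate residual variance per level of C
m_het <- lme(Y ~ W * C, random = ~ 1 | ID, data = df,
             weights = varIdent(form = ~ 1 | C))

# Both fits use REML with identical fixed effects, so they can be compared
cmp <- anova(m_hom, m_het)
print(cmp)

# Estimated residual SD ratios relative to the reference level of C
sd_ratio <- coef(m_het$modelStruct$varStruct, unconstrained = FALSE)
print(sd_ratio)
```

If the real data contain missing Y values coded as NA, lme would also need na.action = na.omit, since its default is to stop on missing values.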
I would really appreciate it if anyone could help me with this.
Thanks in advance,
Carlos Família