
Testing for non-linearity and heterogeneity

2 messages · Nick Isaac, Reinhold Kliegl

It appears that you want to model nonlinearity with a polynomial
function, that is, to test whether the regression of Q on M is better
described by a quadratic than by a linear relation. In addition, you
want to test whether there are reliable differences between units in
the linear and possibly also the quadratic trends.

Such an analysis can be nicely illustrated with the "sleepstudy" data.
Use poly() to specify the degree of the polynomial. fm1 allows only
varying intercepts, fm2 allows varying intercepts and varying linear
slopes, and fm3 allows varying intercepts as well as varying linear and
quadratic trends.
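
As a minimal sketch of the three fits, using the formulas printed in the
anova() output further down: the choice of REML = FALSE (maximum
likelihood, so that the fits are directly comparable) is an assumption
added here, not something stated in this message.

```r
library(lme4)  # provides lmer() and the sleepstudy data

# Fixed effects: quadratic polynomial in Days for all three models.
# Random effects by Subject: intercept only (fm1), intercept + linear
# trend (fm2), intercept + linear + quadratic trend (fm3).
fm1 <- lmer(Reaction ~ poly(Days, 2) + (1 | Subject),
            data = sleepstudy, REML = FALSE)
fm2 <- lmer(Reaction ~ poly(Days, 2) + (poly(Days, 1) | Subject),
            data = sleepstudy, REML = FALSE)
fm3 <- lmer(Reaction ~ poly(Days, 2) + (poly(Days, 2) | Subject),
            data = sleepstudy, REML = FALSE)

# Variance components show what each model lets vary between subjects.
print(VarCorr(fm1))
print(VarCorr(fm2))
print(VarCorr(fm3))
```

Depending on the lme4 version, fm3 may produce a convergence or
singular-fit message.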

You can use anova() to check whether the addition of varying linear
trends and varying quadratic trends leads to a significant improvement
in goodness of fit:
Data: sleepstudy
Models:
fm1: Reaction ~ poly(Days, 2) + (1 | Subject)
fm2: Reaction ~ poly(Days, 2) + (poly(Days, 1) | Subject)
fm3: Reaction ~ poly(Days, 2) + (poly(Days, 2) | Subject)
    Df     AIC     BIC  logLik  Chisq Chi Df Pr(>Chisq)
fm1  5 1802.96 1818.92 -896.48
fm2  7 1764.32 1786.67 -875.16 42.642      2    5.5e-10 ***
fm3 10 1757.68 1789.61 -868.84 12.639      3   0.005487 **
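
A comparison of this kind could be requested roughly as follows. This is
a sketch that refits the three models so it runs on its own; REML = FALSE
is an assumption made here, and the exact numbers may differ slightly
from the table above depending on the lme4 version.

```r
library(lme4)

fm1 <- lmer(Reaction ~ poly(Days, 2) + (1 | Subject),
            data = sleepstudy, REML = FALSE)
fm2 <- lmer(Reaction ~ poly(Days, 2) + (poly(Days, 1) | Subject),
            data = sleepstudy, REML = FALSE)
fm3 <- lmer(Reaction ~ poly(Days, 2) + (poly(Days, 2) | Subject),
            data = sleepstudy, REML = FALSE)

# Sequential likelihood-ratio tests: fm1 vs. fm2, then fm2 vs. fm3.
cmp <- anova(fm1, fm2, fm3)
print(cmp)

# Information criteria on their own, for a side-by-side look.
aic <- AIC(fm1, fm2, fm3)
bic <- BIC(fm1, fm2, fm3)
print(cbind(aic, BIC = bic$BIC))
```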

It does by AIC and by the likelihood-ratio tests: AIC declines and
logLik increases significantly at each step. BIC, however, declines
only from fm1 to fm2 and is slightly higher for fm3 (1789.61) than for
fm2 (1786.67). Next, inspect the CMs (conditional means for LMM;
conditional modes for GLMM and NLMM) of fm3. They show that Subject 332
is a bit of an outlier for the quadratic CMs. Once you remove this
subject, there is no reliable between-subject variance for the
quadratic trend:
      Df    AIC    BIC logLik   Chisq Chi Df Pr(>Chisq)
fm1.2  5 1676.6 1692.3 -833.3
fm2.2  7 1618.4 1640.3 -802.2 62.2002      2  3.115e-14 ***
fm3.2 10 1622.2 1653.6 -801.1  2.1971      3     0.5325
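
The inspection of the CMs and the refit without Subject 332 might look
like the sketch below. The use of ranef(), the object name ss2, and
REML = FALSE are assumptions made for illustration; the model names
fm2.2 and fm3.2 follow the table above.

```r
library(lme4)

fm3 <- lmer(Reaction ~ poly(Days, 2) + (poly(Days, 2) | Subject),
            data = sleepstudy, REML = FALSE)

# Conditional modes per subject: one row per subject, columns for the
# intercept, the linear term, and the quadratic term (third column).
cm <- ranef(fm3)$Subject
quad <- setNames(cm[[3]], rownames(cm))
print(round(sort(quad), 2))

# Refit without Subject 332 (ss2 is a hypothetical name for the subset).
ss2 <- droplevels(subset(sleepstudy, Subject != "332"))
fm2.2 <- lmer(Reaction ~ poly(Days, 2) + (poly(Days, 1) | Subject),
              data = ss2, REML = FALSE)
fm3.2 <- lmer(Reaction ~ poly(Days, 2) + (poly(Days, 2) | Subject),
              data = ss2, REML = FALSE)

# Does the varying quadratic trend still improve the fit?
print(anova(fm2.2, fm3.2))
```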

Interestingly, the fixed-effect quadratic trend is not significant
for the original data set, but after removal of Subject 332, it is
consistently so. Thus, if there were an independent reason that
justifies exclusion of Subject 332, Reaction appears to follow a
quadratic trend over days, but only the mean (intercept) and the linear
trend across days vary reliably between subjects. (This analysis is
only meant as an illustration. I did not check carefully whether this
conclusion holds up under close scrutiny.)

Please note that with the above model specification the number of
random effects grows very quickly. You should not hold lmer
responsible if it fails to converge for your data. Douglas Bates
provided some alternative specifications in an earlier post to keep
the number of random effects limited. Generally, I think that any
question that appears to require a follow-up analysis of CMs can be
rephrased in such a way that it is appropriately represented in the
model from the outset.
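
To make the growth mentioned above concrete, here is a small counting
sketch in base R. It assumes a quadratic fixed-effect trend (3 fixed
effects), 18 subjects as in sleepstudy, and an unstructured covariance
matrix for the by-subject polynomial; count_params is a hypothetical
helper, not part of lme4.

```r
# Count parameters for a by-subject random polynomial of degree d
# (d = 0 means varying intercepts only).
count_params <- function(d, n_subjects = 18, n_fixed = 3) {
  k <- d + 1                    # random effects per subject
  n_varcov <- k * (k + 1) / 2   # variances plus covariances
  data.frame(
    degree = d,
    random_effects = k * n_subjects,
    varcov_params = n_varcov,
    total_df = n_fixed + n_varcov + 1  # + 1 for the residual variance
  )
}

growth <- do.call(rbind, lapply(0:4, count_params))
print(growth)
```

For degrees 0, 1, and 2 the total_df column gives 5, 7, and 10, which
matches the Df column reported for fm1, fm2, and fm3.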

Reinhold Kliegl
On Mon, Jan 19, 2009 at 1:36 PM, Nick Isaac <njbisaac at googlemail.com> wrote: