
Differences in degrees of freedom between a mixed-effects model and a gls model using nlme

3 messages · Hufthammer, Karl Ove, Ken Beath, Ben Bolker

Dear list members,

I'm having some problems understanding the difference in degrees of freedom between a mixed-effects model and a gls model, both fitted using the nlme package. I'm used to it being difficult to figure out the 'correct' (if such a thing exists) number of degrees of freedom in mixed-effects models, but in the following very simple example the *mixed-effects model* seems to use the correct degrees of freedom, while the gls model uses too many degrees of freedom.

Here's the example. Basically, I want to test if a mixed-effects model is equivalent to a paired t-test.

# Generate some correlated data
set.seed(1)
n = 10
u = 5 + rnorm(n, sd=1)
x1 = u + rnorm(n)
x2 = u + rnorm(n) + 1

# A paired t-test is, as expected, more powerful than an unpaired one
t.test(x1, x2)              # p = .1407
t.test(x1, x2, paired=TRUE) # p = .0597

The relevant line of output for the paired test is:
t = -2.1532, df = 9, p-value = 0.05972
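
As a side note (my addition, not part of the original message): the paired test is equivalent to a one-sample t-test on the within-pair differences, which is where the df = n - 1 = 9 comes from:

```r
# Repeat the simulation above so this runs on its own
set.seed(1)
n = 10
u = 5 + rnorm(n, sd = 1)
x1 = u + rnorm(n)
x2 = u + rnorm(n) + 1

# The paired test is a one-sample t-test on the differences x1 - x2,
# hence df = n - 1 = 9
t.test(x1 - x2)  # same t, df and p-value as t.test(x1, x2, paired = TRUE)
```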

# Put the data in long format (a 'values' column, an 'ind' factor
# identifying x1/x2, and a pair 'id'), then fit a linear mixed-effects model
library(nlme)
df = data.frame(stack(list(x1 = x1, x2 = x2)), id = factor(rep(1:n, 2)))
l = lme(values ~ ind, random = ~1 | id, data = df)
summary(l)

The relevant lines of output are:

               Value Std.Error DF   t-value p-value
indx2       0.617482 0.2867705  9  2.153226  0.0597

So we get the same t-value (to five decimal places - it differs in the last decimal), degrees of freedom and p-value as in the paired t-test. Let's now try to fit a gls model for correlated data. This should in theory be equivalent to doing a paired t-test. We get:

#  Fit a correlated gls model
library(nlme)
l2 = gls(values~ind, correlation=corSymm(form=~1|id), data=df)
summary(l2)

               Value Std.Error   t-value p-value
indx2       0.617482 0.2867703  2.153228  0.0451

The t-value is the same as in the t-test (here to *six* decimal places), but the p-value is too small. The reason is that the test uses twice the degrees of freedom it should, namely 18:

2*pt(2.153228, 18, lower.tail = FALSE) # 0.04510817
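
To spell out the two conventions (my gloss, not part of the original post): for a term that varies within groups, lme's heuristic gives N minus the number of groups minus the number of within-group coefficients, while gls's summary simply uses N minus the number of fixed-effect coefficients:

```r
N = 20         # observations: 10 pairs in long format
n_groups = 10  # pairs

N - n_groups - 1  # = 9, lme's within-group df for the ind effect
N - 2             # = 18, the df gls's summary uses for every coefficient

2 * pt(2.153228, N - n_groups - 1, lower.tail = FALSE)  # ~ 0.0597
2 * pt(2.153228, N - 2, lower.tail = FALSE)             # ~ 0.0451
```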

So the mixed-effects model gets it right, while the gls model gets it wrong. But it's actually the *gls* model that should be equivalent to the paired t-test, not the mixed-effects model. We can easily see this by changing the correlation to be negative:

set.seed(2)
n = 10
u = 5 + rnorm(n, sd=1)
x1 = u + rnorm(n)
x2 = 15 - 2*u + rnorm(n) + 1
cor(x1, x2) # -0.50

Unpaired t-test:
t = -0.5052, df = 15.753, p-value = 0.6204

Paired t-test (note that the p-value is *greater* than for the unpaired case):
t = -0.418, df = 9, p-value = 0.6857

Mixed-effects model:

               Value Std.Error DF  t-value p-value
indx2       0.481868 0.9537336  9 0.505244  0.6255

Of course, the mixed-effects model doesn't really fit the data. The random effect variance is estimated to be ~0, so basically this corresponds to an *unpaired* t-test (but one that assumes equal variance). Note that the t-statistic is equal to the t-statistic of the unpaired t-test (though the latter uses 18 degrees of freedom, while this model uses 9 degrees of freedom).

Now, if I've understood everything correctly, the gls model should *exactly* correspond to a paired t-test; it's the same underlying model. And indeed we get the same t-statistic (0.418):

               Value Std.Error  t-value p-value
indx2       0.481868 1.1526947 0.418036  0.6809

But the p-value differs somewhat, since the gls model assumes 18 degrees of freedom, while the t-test (correctly) assumes 9 degrees of freedom.

At least for positive correlations, a mixed-effects random-intercept model and a gls model with compound symmetry should be *almost* equivalent, so I see no reason the gls model should assume that the test statistic has twice as many degrees of freedom as in the mixed-effects model. So why the discrepancy? Shouldn't I trust the degrees of freedom calculations from gls()?
1 day later
All 3 (paired t-test, mixed-effects model and gls with compound symmetry) are
fitting the same model, and so should give the same result. That is what
you see with the first example. The gls model is not getting it wrong,
except for the df.

For the second example the 3 model results should again be the same. I'm not
certain why they differ, but it may be numerical. Even though the data come
from a model that isn't correct for the fitting, that should be irrelevant:
it is the data that produce the model fit, not the model that produces the
data. Possibly estimates of the correlation are poor when there is little
correlation, and that flows through to the mixed-effects and gls results.

The relationship to the unpaired t-test is probably irrelevant. Note also
that the default for t.test() is unequal variances, whereas for a mixed
model it is equal variances.
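
To illustrate that last point (my addition; this reuses the second, set.seed(2) example): with equal group sizes the Welch and pooled t statistics coincide, so only the df, and hence the p-value, differ:

```r
# Regenerate the second example's data so this runs on its own
set.seed(2)
n = 10
u = 5 + rnorm(n, sd = 1)
x1 = u + rnorm(n)
x2 = 15 - 2*u + rnorm(n) + 1

t.test(x1, x2)                    # Welch (t.test's default): fractional df
t.test(x1, x2, var.equal = TRUE)  # pooled: df = 18, same t-value (equal n)
```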

The df for gls is, in a sense, obviously a bug. But getting the df for a mixed
model isn't easy. Here we have a nice simple correlation structure and
there is an obvious correct answer, but usually there isn't one. If the
model assumed uncorrelated data then the gls df would be correct, so the
software would need to work out what is going on. Using a parametric
bootstrap to determine the underlying distribution seems a better
method if accuracy is important.
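
A minimal sketch of the parametric bootstrap Ken mentions (my illustration, not code from the thread): simulate from the fitted null model, recompute the test statistic each time, and use the simulated distribution instead of a nominal t distribution. Here I plug in the first example's structure, assuming unit variances:

```r
set.seed(1)
n = 10
nsim = 2000

# Simulate the null model (random intercept u, no treatment effect),
# recording the paired t statistic each time
t_sim = replicate(nsim, {
  u  = rnorm(n, sd = 1)
  y1 = u + rnorm(n)
  y2 = u + rnorm(n)
  unname(t.test(y1, y2, paired = TRUE)$statistic)
})

# The simulated 95% critical value should be close to qt(.975, 9),
# not qt(.975, 18)
quantile(abs(t_sim), 0.95)
qt(0.975, df = 9)
```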

Ken

On 6 February 2015 at 19:34, Hufthammer, Karl Ove <
karl.ove.hufthammer at helse-bergen.no> wrote:

Ken Beath <ken.beath at ...> writes:
For what it's worth, you can easily see what gls() is doing to
get its df, and confirm that it's naive, by printing nlme:::summary.gls:

  tTable[, "p-value"] <- 2 * pt(-abs(tTable[, "t-value"]), 
        dims$N - dims$p)
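
So for the first example (my quick check of that formula): dims$N is 20 and dims$p is 2, which gives the naive 18 df and reproduces the 0.0451 p-value from summary(l2):

```r
N = 20  # rows in the long-format data (dims$N)
p = 2   # fixed-effect coefficients, intercept and ind (dims$p)
2 * pt(-abs(2.153228), N - p)  # 0.04510817, as in summary(l2)
```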

For what it's worth, I've found that the df calculations used by
lme() often fail quite badly for random-slopes models ... it's often
really hard to guess, even for simpler designs (i.e. where there
really is a precise correspondence with an F distribution -- no correlation
structures or lack of balance or crossed random effects).