Skip to content

Constraining error variance to 0 in a mixed model.

6 messages · Mollie Brooks, Rolf Turner, Dimitris Rizopoulos

#
Let me start off by apologising for beating this issue to death on this 
mailing list.  I have raised it (without getting satisfactory answers) 
on several occasions.  If you are fed up to the teeth with my maundering 
s, please just press the delete button.

This latest occasion was triggered by my trying to learn to use the 
glmmTMB function (from the package of the same name).  I noticed (to my 
initial delight) that glmmTMB() has an argument "dispformula" and the 
help says (in particular) "In Gaussian mixed models, dispformula=~0 
fixes the [dispersion] parameter to be 0, forcing variance into the 
random effects."

I said "Ah-ha!  The very thing."  So I tried it out, using the following 
data set (supplied by Ben Pelzer in a posting to this list, and 
previously used by my very good self to illustrate the problem that I am 
trying to solve):

Dat <- structure(list(person = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L,
4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 8L, 9L,
9L, 9L, 10L, 10L, 10L, 11L, 11L, 11L, 12L, 12L, 12L, 13L, 13L,
13L, 14L, 14L, 14L, 15L, 15L, 15L, 16L, 16L, 16L, 17L, 17L, 17L,
18L, 18L, 18L, 19L, 19L, 19L, 20L, 20L, 20L), occasion = structure(c(1L,
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L,
3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L,
1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L,
2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), .Label = c("1",
"2", "3"), class = "factor"), test = c(25L, 21L, 27L, 27L, 19L,
23L, 20L, 18L, 18L, 26L, 33L, 46L, 25L, 36L, 47L, 36L, 35L, 41L,
30L, 30L, 37L, 23L, 21L, 19L, 39L, 37L, 33L, 29L, 36L, 49L, 29L,
34L, 44L, 39L, 32L, 30L, 26L, 32L, 36L, 33L, 21L, 8L, 30L, 34L,
34L, 27L, 34L, 39L, 41L, 40L, 44L, 37L, 34L, 34L, 23L, 26L, 30L,
31L, 28L, 27L)), row.names = c(NA, -60L), class = "data.frame")

I tried three ways of "analysing" these data:

(1) Simple-minded multivariate analysis:

X   <- matrix(Dat$test,byrow=TRUE,ncol=3)
colnames(X) <- paste0("occasion",1:3)
mu  <- apply(X,2,mean)
Sig <- var(X)/20 # X has 20 rows

(2) Using lmer():
library(lme4)
fit2 <- lmer(test ~ 0 + occasion + (0 + occasion | person),data=Dat,
              control=lmerControl(check.nobs.vs.nRE = "ignore"))

(The "control" argument was gathered from a posting by Maarten Jung, on 
which I have previously commented.)

Note that this "works" but throws (as I have previously noted)
a disconcerting warning message:
(3) Using glmmTMB():
library(glmmTMB)
fit3 <- glmmTMB(test ~ 0 + occasion + (0 + occasion | person),data=Dat,
                 dispformula = ~0)

(No warning message; initially I said "Wheeeee!!!")

The "parameter estimates" mu, fixef(fit2) and fixef(fit3) all agree:

occasion1 occasion2 occasion3
     29.80     30.05     33.30

The covariance Sig and vcov(fit2) agree modulo numerical noise:

 > Sig
           occasion1 occasion2 occasion3
occasion1 1.7821053  1.150526 0.5768421
occasion2 1.1505263  2.249868 3.0623684
occasion3 0.5768421  3.062368 5.8531579

 > vcov(fit2)
3 x 3 Matrix of class "dpoMatrix"
           occasion1 occasion2 occasion3
occasion1 1.7821054  1.150526 0.5768422
occasion2 1.1505265  2.249868 3.0623683
occasion3 0.5768422  3.062368 5.8531577

However vcov(fit3) differs; after some head scratching I realised that 
it is equal to 19/20 times the others.  After a little more head 
scratching I said ah-ha!  The glmmTMB() function sets REML=FALSE by
default.  So I'll just call it with REML=TRUE:

fit3 <- glmmTMB(test ~ 0+occasion + (0+occasion | person),data=Dat,
                 dispformula = ~0,REML=TRUE)

And now vcov(fit3) more-or-less agrees with the other results:

 > vcov(fit3)
Conditional model:
           occasion1 occasion2 occasion3
occasion1 1.7819337  1.150276 0.5765386
occasion2 1.1502758  2.249755 3.0623895
occasion3 0.5765386  3.062390 5.8534164

But, to my extreme irritation, glmmTMB() now also throws a warning:
(I've looked at the vignette, and I can sort of see a connection, but 
not really.)

*Why* does the bloody universe always do this sort of thing to me?  Why 
on earth should dividing by 19 rather than 20, effectively, cause such a 
warning to be thrown?

Isn't something a little out of whack here?

I reiterate at this point a remark that I made in a previous post on the 
issue of constraining the error variance to 0:
Am I being unreasonable?  (*Me*?  *Unreasonable*?  Perish the thought!) :-)

cheers,

Rolf Turner

P.S.  Note that setting REML=FALSE in the call to lmer() causes it to
produce a covariance matrix that agrees with that produced by glmmTMB()
with default setting.  However this has no impact upon the error message 
issued by lmer().

R. T.
#
For what it's worth, the same model can be fitted with nlme::gls without any warning messages:

library(nlme)
fit4 <- gls(test ~ 0 + occasion, data = Dat, 
            weights = varIdent(form = ~ 1 | occasion),
            correlation = corSymm(form = ~ 1 | person))

vcov(fit4)

Best,
Dimitris


-----Original Message-----
From: R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org> On Behalf Of Rolf Turner
Sent: Friday, January 11, 2019 3:30 AM
To: Help Mixed Models <r-sig-mixed-models at r-project.org>
Subject: [R-sig-ME] Constraining error variance to 0 in a mixed model.


Let me start off by apologising for beating this issue to death on this mailing list.  I have raised it (without getting satisfactory answers) on several occasions.  If you are fed up to the teeth with my maundering s, please just press the delete button.

This latest occasion was triggered by my trying to learn to use the glmmTMB function (from the package of the same name).  I noticed (to my initial delight) that glmmTMB() has an argument "dispformula" and the help says (in particular) "In Gaussian mixed models, dispformula=~0 fixes the [dispersion] parameter to be 0, forcing variance into the random effects."

I said "Ah-ha!  The very thing."  So I tried it out, using the following data set (supplied by Ben Pelzer in a posting to this list, and previously used by my very good self to illustrate the problem that I am trying to solve):

Dat <- structure(list(person = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 4L, 4L, 4L, 5L, 5L, 5L, 6L, 6L, 6L, 7L, 7L, 7L, 8L, 8L, 8L, 9L, 9L, 9L, 10L, 10L, 10L, 11L, 11L, 11L, 12L, 12L, 12L, 13L, 13L, 13L, 14L, 14L, 14L, 15L, 15L, 15L, 16L, 16L, 16L, 17L, 17L, 17L, 18L, 18L, 18L, 19L, 19L, 19L, 20L, 20L, 20L), occasion = structure(c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), .Label = c("1", "2", "3"), class = "factor"), test = c(25L, 21L, 27L, 27L, 19L, 23L, 20L, 18L, 18L, 26L, 33L, 46L, 25L, 36L, 47L, 36L, 35L, 41L, 30L, 30L, 37L, 23L, 21L, 19L, 39L, 37L, 33L, 29L, 36L, 49L, 29L, 34L, 44L, 39L, 32L, 30L, 26L, 32L, 36L, 33L, 21L, 8L, 30L, 34L, 34L, 27L, 34L, 39L, 41L, 40L, 44L, 37L, 34L, 34L, 23L, 26L, 30L, 31L, 28L, 27L)), row.names = c(NA, -60L), class = "data.frame")

I tried three ways of "analysing" these data:

(1) Simple-minded multivariate analysis:

X   <- matrix(Dat$test,byrow=TRUE,ncol=3)
colnames(X) <- paste0("occasion",1:3)
mu  <- apply(X,2,mean)
Sig <- var(X)/20 # X has 20 rows

(2) Using lmer():
library(lme4)
fit2 <- lmer(test ~ 0 + occasion + (0 + occasion | person),data=Dat,
              control=lmerControl(check.nobs.vs.nRE = "ignore"))

(The "control" argument was gathered from a posting by Maarten Jung, on which I have previously commented.)

Note that this "works" but throws (as I have previously noted) a disconcerting warning message:
(3) Using glmmTMB():
library(glmmTMB)
fit3 <- glmmTMB(test ~ 0 + occasion + (0 + occasion | person),data=Dat,
                 dispformula = ~0)

(No warning message; initially I said "Wheeeee!!!")

The "parameter estimates" mu, fixef(fit2) and fixef(fit3) all agree:

occasion1 occasion2 occasion3
     29.80     30.05     33.30

The covariance Sig and vcov(fit2) agree modulo numerical noise:

 > Sig
           occasion1 occasion2 occasion3
occasion1 1.7821053  1.150526 0.5768421
occasion2 1.1505263  2.249868 3.0623684
occasion3 0.5768421  3.062368 5.8531579

 > vcov(fit2)
3 x 3 Matrix of class "dpoMatrix"
           occasion1 occasion2 occasion3
occasion1 1.7821054  1.150526 0.5768422
occasion2 1.1505265  2.249868 3.0623683
occasion3 0.5768422  3.062368 5.8531577

However vcov(fit3) differs; after some head scratching I realised that it is equal to 19/20 times the others.  After a little more head scratching I said ah-ha!  The glmmTMB() function sets REML=FALSE by default.  So I'll just call it with REML=TRUE:

fit3 <- glmmTMB(test ~ 0+occasion + (0+occasion | person),data=Dat,
                 dispformula = ~0,REML=TRUE)

And now vcov(fit3) more-or-less agrees with the other results:

 > vcov(fit3)
Conditional model:
           occasion1 occasion2 occasion3
occasion1 1.7819337  1.150276 0.5765386
occasion2 1.1502758  2.249755 3.0623895
occasion3 0.5765386  3.062390 5.8534164

But, to my extreme irritation, glmmTMB() now also throws a warning:
(I've looked at the vignette, and I can sort of see a connection, but not really.)

*Why* does the bloody universe always do this sort of thing to me?  Why on earth should dividing by 19 rather than 20, effectively, cause such a warning to be thrown?

Isn't something a little out of whack here?

I reiterate at this point a remark that I made in a previous post on the issue of constraining the error variance to 0:
Am I being unreasonable?  (*Me*?  *Unreasonable*?  Perish the thought!) :-)

cheers,

Rolf Turner

P.S.  Note that setting REML=FALSE in the call to lmer() causes it to produce a covariance matrix that agrees with that produced by glmmTMB() with default setting.  However this has no impact upon the error message issued by lmer().

R. T.

--
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276

_______________________________________________
R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
#
Hi Rolf, 

Thanks for the enthusiasm. I believe the ability to force the variance into the RE in glmmTMB was Kasper Kristensen?s idea.

I had a look at your example and it seems that it could be a false positive coming from nlminb. The maximum gradient component is 2.4e-6, so it looks to me like the model converged. 

In the future, the glmmTMB developers will reconsider the convergence check that relies on nlminb?s convergence codes.

if (!is.null(fit$convergence) && fit$convergence != 0) warning()

cheers,
Mollie

  
  
#
On 1/11/19 8:53 PM, D. Rizopoulos wrote:

            
Thanks.  That's interesting, but a bit obscure from the point of view of 
a naive young ( :-) ) thing like myself.  The syntax of the call to 
gls() is mind-boggling --- at least to my feeble mind.

What I really wanted (as I have remarked in similar posts in the past) 
is to be able to fit simple mixed models in a simple-minded way, to 
check that my call to the more sophisticated software is correct.  This 
desideratum kind of falls apart if the sophisticated software 
cannot/won't fit the simple model (at least not without throwing a 
hissy-fit).

I think that what I *can* take away from your fitting procedure is that 
there are/should be no insurmountable intrinsic numerical barriers to 
fitting the simple model that I propose by means of sophisticated software.

cheers,

Rolf
#
On 1/12/19 1:40 AM, Mollie Brooks wrote:

            
Thanks. That's encouraging.

cheers,

Rolf
#
The call to gls() is the one that fits an unstructured covariance matrix for the error terms. Namely, the varIdent() function in the weights arguments specifies that we want a difference variance parameter per occasion, and the corSymm() function in the correlation argument specifies that we want a general symmetric correlation matrix for the measurements of each person.

In any case, a much simpler call to lme() also seems to work, i.e.,

fit5 <- lme(test ~ 0 + occasion, random = ~ 0 + occasion | person, data = Dat)

vcov(fit5)


Best,
Dimitris


-----Original Message-----
From: Rolf Turner <r.turner at auckland.ac.nz> 
Sent: Friday, January 11, 2019 10:08 PM
To: D. Rizopoulos <d.rizopoulos at erasmusmc.nl>
Cc: Help Mixed Models <r-sig-mixed-models at r-project.org>
Subject: Re: [R-sig-ME] Constraining error variance to 0 in a mixed model.
On 1/11/19 8:53 PM, D. Rizopoulos wrote:

            
Thanks.  That's interesting, but a bit obscure from the point of view of a naive young ( :-) ) thing like myself.  The syntax of the call to
gls() is mind-boggling --- at least to my feeble mind.

What I really wanted (as I have remarked in similar posts in the past) is to be able to fit simple mixed models in a simple-minded way, to check that my call to the more sophisticated software is correct.  This desideratum kind of falls apart if the sophisticated software cannot/won't fit the simple model (at least not without throwing a hissy-fit).

I think that what I *can* take away from your fitting procedure is that there are/should be no insurmountable intrinsic numerical barriers to fitting the simple model that I propose by means of sophisticated software.

cheers,

Rolf

--
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276