Skip to content

negative variances

6 messages · Tu Yu-Kang, John Maindonald, Douglas Bates

#
On 4/11/07, Tu Yu-Kang <yukangtu at hotmail.com> wrote:
It is possible that the ML or REML estimates for a variance component
can be zero.  The algorithm used in lme doesn't perform well in this
situation which is one reason that the lmer and lmer2 functions in the
lme4 package were created.  Could you try fitting the model with those
or provide us with the data so we can check it out?

I recommend moving this discussion to the R-SIG-mixed-models mailing
list which I am copying on this reply.
#
Dear Prof Bates,

Many thanks for your email. I tried lmer() and received the following 
messages:
Warning message:
Estimated variance-covariance for factor 'id' is singular
 in: `LMEoptimize<-`(`*tmp*`, value = list(maxIter = 200, tolerance = 
1.49011611938477e-08,  

I then tried lmer2():
Linear mixed-effects model fit by REML 
  AIC  BIC logLik MLdeviance REMLdeviance
 1146 1166 -567.9       1126         1136
Random effects:
 Groups   Name        Variance  Std.Dev. Corr  
 id       (Intercept) 0.1566631 0.395807       
          month       0.0022331 0.047255 1.000 
 Residual             0.6391120 0.799445       
Number of obs: 420, groups: id, 140

Fixed effects:
            Estimate Std. Error t value
(Intercept)  6.43595    0.07017   91.72
month       -0.36619    0.01642  -22.30

Correlation of Fixed Effects:
      (Intr)
month -0.544


However, I am not sure about the results, because MLwiN showed both random 
effects were negative values (-0.196 and -0.023).

I start to notice this problems of negative variances when I am learning 
how to use structural equation modeling software to run multilevel models 
for longitudinal data. To my great surprise, it occurs quite frequently. In 
SEM, this problem sometimes may be overcome by estimating  a nonlinear 
model by freeing the factor loadings. For example, in this data, PPD 
(probing pocket depth) was measured three times at month 0, 3 and 6. I only 
fixed the first and last factor loadings to be 0 and 6 to get a non-linear 
relation, and I also allow the level-1 residuals to be different on each 
occasion. However, in some data, I failed to get a satifactory model no 
matter how I modified my models.

I looked for the discussion in several multilevel modeling textbooks but 
only found one short discussion in the book by Brown and Prescott. SEM 
literature usually suggest fixing the negative variances to 0. However, I 
wander whether this is the only way to get around this problem or the 
sensible way because if the random effects are fixed to 0 the model is no 
longer a random effects model.

With best regards,

Yu-Kang
#
Negative variances, unless they can be explained as statistical error,
surely indicate that the model that is formulated expecting variances
to be positive is inappropriate.  My preference is to do what MLwiN
does, and allow the estimates to go negative, just so that the user is
alerted to problems with the model, or at least with its formulation.

The model that has negative "variance" components may from the
point of view of getting the variance-covariance structure correct be
OK, just formulated using parameters that, notwithstanding a fervent
wish to interpret them as variances, cannot be so interpreted.

I once heard, from people providing a data analysis service, of
receiving randomized block treatment comparison data where it
turned out that the blocks had been chosen, e.g. with their plots
at increasing distances from a stream.  A negative between block
component of "variance" was, given the design, to be expected.
I did myself once encounter a scientist who thought that blocks
should be chosen, as far as possible, so that each individual block
spanned as large a part of the variation as possible.

A problem with SEMs is that this stuff typically hides under the hood.
It is not obvious how the user might get to look under the hood
(at the layout of the blocks, maybe) and comment, "Ah, I might have
guessed as much."

Graphical models (the name is not as revealing as one might like
of the nature of these models) can be a huge advance because they
do try to pull apart what goes into SEMs, to look under the hood.
See the gR task view for R, that you can get to from a CRAN mirror.

John Maindonald             email: john.maindonald at anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
On 12 Apr 2007, at 6:54 PM, Tu Yu-Kang wrote:

            
#
On 4/12/07, Tu Yu-Kang <yukangtu at hotmail.com> wrote:
I'm a bit pressed for time right now so this will be brief.  I suggest
that you add the optional argument

control = list(msVerbose = TRUE)

to the calls to lmer and to lmer2 and look at the progress of the
iterations and also at the value of the deviance.  I have seen
situations where lmer2 converges to a fit with a significantly smaller
deviance than can lmer because it progresses through the near-singular
region of the parameter space.

What was the deviance (or the log-likelihood) from the MLWin fit that
gave negative variance components?

Did you plot the data as ppd versus month by id using xyplot from the
lattice package?  That should always be the first step in the analysis
of longitudinal data.
#
Dear Prof Bates,
Warning message:
Estimated variance-covariance for factor 'id' is singular
 in: `LMEoptimize<-`(`*tmp*`, value = list(maxIter = 200, tolerance = 
1.49011611938477e-08,
0      1191.73: 0.888889 0.0592593  0.00000
  1      1174.58: 5.00000e-10 5.00000e-10 0.281419
  2      1158.10: 0.956617 0.00924174 -0.00978256
  3      1151.88: 0.875101 5.00000e-10 0.550741
  4      1144.22: 0.605105 5.00000e-10 0.550741
  5      1142.59: 0.335108 5.00000e-10 0.550741
  6      1142.06: 0.452561 5.00000e-10 0.550741
  7      1141.97: 0.423318 5.00000e-10 0.550740
  8      1141.97: 0.416524 5.00000e-10 0.550740
  9      1141.97: 0.417039 5.00000e-10 0.550740
Warning message:
Estimated variance-covariance for factor 'id' is singular
 in: `LMEoptimize<-`(`*tmp*`, value = list(maxIter = 200, tolerance = 
1.49011611938477e-08,
0      1191.73: 0.942809 0.243432  0.00000
  1      1181.40: 0.870948  0.00000 -0.124082
  2      1146.91: 0.855545 2.18615e-05 0.0163376
  3      1140.17: 0.413096  0.00000 0.0798584
  4      1137.48: 0.484553 2.13939e-09 0.179021
  5      1136.23: 0.542748  0.00000 0.0799814
  6      1135.87: 0.508893 0.0117775 0.110609
  7      1135.79: 0.496375  0.00000 0.115248
  8      1135.79: 0.497595  0.00000 0.117998
  9      1135.79: 0.494810 2.92361e-08 0.119135
 10      1135.79: 0.494925 1.82196e-08 0.119321
 11      1135.79: 0.495102  0.00000 0.119390
 12      1135.79: 0.495102  0.00000 0.119390
The -2loglikelihhood from MLwiN is 1104.594. I did plot the data and they 
looked fine (at least to me).

With best regards,

Yu-Kang
2 days later
#
On 4/12/07, Tu Yu-Kang <yukangtu at hotmail.com> wrote:
Both of those optimization paths show demonstrate convergence to a
singular variance-covariance matrix.  The first two parameters being
optimized are on the scale of variances (lmer) or standard deviations
(lmer2) and are constrained to be non-negative.  In the case of lmer
they need to be constrained to be slightly positive (>= 1e-10).  You
can see that both case have converged on the boundary although lmer2
achieved a better minimum (lower deviance).

The first parameter represents the relative variance (or relative
standard deviation for lmer2) of the intercept random effect.  The
second parameter is a relative variance (standard deviation) of a
linear combination of the random effects.  The third parameter
determines the linear combination.

To me this indicates model failure.  There are finite variances for
both the slope and intercept random effects but they  are perfectly
correlated.