negative variances
Negative variances, unless they can be explained as statistical error, surely indicate that the model that is formulated expecting variances to be positive is inappropriate. My preference is to do what MLwiN does, and allow the estimates to go negative, just so that the user is alerted to problems with the model, or at least with its formulation. The model that has negative "variance" components may from the point of view of getting the variance-covariance structure correct be OK, just formulated using parameters that, notwithstanding a fervent wish to interpret them as variances, cannot be so interpreted. I once heard, from people providing a data analysis service, of receiving randomized block treatment comparison data where it turned out that the blocks had been chosen, e.g. with their plots at increasing distances from a stream. A negative between block component of "variance" was, given the design, to be expected. I did myself once encounter a scientist who thought that blocks should be chosen, as far as possible, so that each individual block spanned as large a part of the variation as possible. A problem with SEMs is that this stuff typically hides under the hood. It is not obvious how the user might get to look under the hood (at the layout of the blocks, maybe) and comment, "Ah, I might have guessed as much." Graphical models (the name is not as revealing as one might like of the nature of these models) can be a huge advance because they do try to pull apart what goes into SEMs, to look under the hood. See the gR task view for R, that you can get to from a CRAN mirror. John Maindonald email: john.maindonald at anu.edu.au phone : +61 2 (6125)3473 fax : +61 2(6125)5549 Centre for Mathematics & Its Applications, Room 1194, John Dedman Mathematical Sciences Building (Building 27) Australian National University, Canberra ACT 0200.
On 12 Apr 2007, at 6:54 PM, Tu Yu-Kang wrote:
Dear Prof Bates, Many thanks for your email. I tried lmer() and received the following messages:
lm2<-lmer(ppd~month+(month|id))
Warning message: Estimated variance-covariance for factor 'id' is singular in: `LMEoptimize<-`(`*tmp*`, value = list(maxIter = 200, tolerance = 1.49011611938477e-08, I then tried lmer2():
lm3<-lmer2(ppd~month+(month|id)) summary(lm3)
Linear mixed-effects model fit by REML AIC BIC logLik MLdeviance
REMLdeviance
1146 1166 -567.9 1126 1136
Random effects:
Groups Name Variance Std.Dev. Corr id (Intercept)
0.1566631 0.395807 month 0.0022331 0.047255
1.000 Residual 0.6391120 0.799445 Number of obs:
420, groups: id, 140
Fixed effects:
Estimate Std. Error t value
(Intercept) 6.43595 0.07017 91.72
month -0.36619 0.01642 -22.30
Correlation of Fixed Effects:
(Intr)
month -0.544
However, I am not sure about the results, because MLwiN showed both
random effects were negative values (-0.196 and -0.023).
I start to notice this problems of negative variances when I am
learning how to use structural equation modeling software to run
multilevel models for longitudinal data. To my great surprise, it
occurs quite frequently. In SEM, this problem sometimes may be
overcome by estimating a nonlinear model by freeing the factor
loadings. For example, in this data, PPD (probing pocket depth) was
measured three times at month 0, 3 and 6. I only fixed the first
and last factor loadings to be 0 and 6 to get a non-linear
relation, and I also allow the level-1 residuals to be different on
each occasion. However, in some data, I failed to get a satifactory
model no matter how I modified my models.
I looked for the discussion in several multilevel modeling
textbooks but only found one short discussion in the book by Brown
and Prescott. SEM literature usually suggest fixing the negative
variances to 0. However, I wander whether this is the only way to
get around this problem or the sensible way because if the random
effects are fixed to 0 the model is no longer a random effects model.
With best regards,
Yu-Kang
From: "Douglas Bates" <bates at stat.wisc.edu> To: "Tu Yu-Kang" <yukangtu at hotmail.com> CC: r-help at stat.math.ethz.ch, r-sig-mixed-models at r-project.org Subject: Re: [R] negative variances Date: Wed, 11 Apr 2007 09:15:21 -0500 On 4/11/07, Tu Yu-Kang <yukangtu at hotmail.com> wrote:
Dear R experts, I had a question which may not be directly relevant to R but I will be grateful if you can give me some advices. I ran a two-level multilevel model for data with repeated measurements over time, i.e. level-1 the repeated measures and level-2 subjects. I could not get convergence using lme(), so I tried MLwiN, which eventually showed the level-2 variances (random effects for the intercept and slope) were negative values. I know this is known as Heywood cases in the structural equation modeling literature, but the only discussion on this problem in the literature of multilevel models and random effects models I can find is in the book by Prescott and Brown. Any suggestion on how to solve this problem will be highly appreciated.
It is possible that the ML or REML estimates for a variance component can be zero. The algorithm used in lme doesn't perform well in this situation which is one reason that the lmer and lmer2 functions in the lme4 package were created. Could you try fitting the model with those or provide us with the data so we can check it out? I recommend moving this discussion to the R-SIG-mixed-models mailing list which I am copying on this reply.
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models