comparing posterior means
Dear Douglas, Harold and Emmanuel,

The rather technical discussion is difficult for me to follow, and I still feel unsure about what the right answer to my question is. To test the difference between the two posterior means, does it make sense to test the difference of the two random effects, u0A - u0B, and use the (co)variances of both to test H0: u0A - u0B = 0? Or should the variance of the fixed effect b0 also be taken into account (which I now believe is not needed)? Please help me.

Humble regards, Ben.
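For concreteness, here is a minimal sketch of the test I have in mind (the model object modela and the country labels "A" and "B" are placeholders, and the sketch assumes the between-country conditional covariance is negligible, which is exactly the term under discussion below):

## Approximate Wald test of H0: u0A - u0B = 0 from the conditional
## means and their conditional variances. Any covariance between the
## two countries' effects is assumed negligible here.
library(lme4)

rr <- ranef(modela, condVar = TRUE)
pv <- attr(rr$country, "postVar")       # 1 x 1 x J array of conditional variances
u  <- rr$country[, "(Intercept)"]       # conditional means uu0j
j  <- rownames(rr$country)

d  <- u[j == "A"] - u[j == "B"]         # estimate of u0A - u0B
vd <- pv[1, 1, j == "A"] + pv[1, 1, j == "B"]  # ignores any covariance term
z  <- d / sqrt(vd)
2 * pnorm(-abs(z))                      # approximate two-sided p-value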
On 15-10-2014 20:16, Douglas Bates wrote:
On Wed, Oct 15, 2014 at 12:24 PM, Doran, Harold <HDoran at air.org> wrote:
Doug:
I grab the variance/covariance matrix of the random effects in a
way that I think will make you cringe, but I will share it here and
am interested in learning how it can be done more efficiently.
Keep two things in mind. First, my lme.eiv function (which stands
for linear mixed model error-in-variables) uses Henderson's
equations as typically described. Second, it stands on your
shoulders and uses almost all of the sparse-matrix functions in
the Matrix package.
BTW, though my function is not available in a package, I am happy
to share it with you. It has complete technical documentation and
is written using S3 methods, with the common extractor functions
(and more). The function is intended for use when the variables on
the RHS are measured with error; otherwise, it simply matches
lmer's output.
So, let me use the following to represent the linear model:

Xb = y

Assume X is the leftmost matrix in Henderson's equations (so it is
a big 2x2 blocked matrix), b is a vector holding both the fixed
and random effects, and y is the outcome. The matrix X is big (and
very sparse in my applications), so I find its Cholesky
decomposition and then simply solve the triangular systems until a
solution is reached.
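In Matrix-package terms, that step looks roughly like the following sketch (the banded matrix here is just a hypothetical stand-in for Henderson's blocked matrix):

library(Matrix)

## A sparse, symmetric positive-definite stand-in for Henderson's matrix X.
X <- bandSparse(9, k = 0:1, diagonals = list(rep(4, 9), rep(-1, 8)),
                symmetric = TRUE)
y <- rnorm(9)

## Sparse Cholesky factorization (with a fill-reducing permutation);
## solve() on the factorization object then handles both triangular solves.
ch <- Cholesky(X)
b  <- solve(ch, y)    # solves X b = y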
I think I am missing a couple of steps here. Henderson's mixed model
equations, as described, say, in the Wikipedia entry, are a set of
penalized normal equations. That is, they look like
X'X b = X'y
but with a block structure and with a penalty on the random effects
crossproduct matrix. In the Wikipedia article the lower right block
is written as Z'R^{-1}Z + G^{-1} and I would describe G^{-1} as the
penalty term in that it penalizes large values of the coefficients b.
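Written out in that notation (my transcription, so treat it as a sketch), the full system is

\[
\begin{pmatrix}
X' R^{-1} X & X' R^{-1} Z \\
Z' R^{-1} X & Z' R^{-1} Z + G^{-1}
\end{pmatrix}
\begin{pmatrix} \hat\beta \\ \hat u \end{pmatrix}
=
\begin{pmatrix} X' R^{-1} y \\ Z' R^{-1} y \end{pmatrix}
\]

and it is the G^{-1} term in the lower-right block that supplies the penalty.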
In lmer a similar penalized least squares (PLS) problem is solved for
each evaluation of the profiled log-likelihood or profiled REML
criterion. The difference is that instead of solving for the
conditional means of the random effects on the original scale we solve
for the conditional means of the "spherical" random effects. Also, the
order of the coefficients in the PLS problem is switched so that the
random effects come before the fixed effects. Doing things this way
allows for evaluation of the profiled log-likelihood from the
determinant of the sparse Cholesky factor and the penalized
residual sum-of-squares (equations 34 and 41, for REML, in the
paper on arxiv.org).
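For reference, the PLS system in this formulation (as I recall it from the paper, with \(\Lambda_\theta\) the relative covariance factor and the spherical random effects ordered first) is

\[
\begin{pmatrix}
\Lambda_\theta' Z' Z \Lambda_\theta + I & \Lambda_\theta' Z' X \\
X' Z \Lambda_\theta & X' X
\end{pmatrix}
\begin{pmatrix} \mu_{U|Y} \\ \hat\beta \end{pmatrix}
=
\begin{pmatrix} \Lambda_\theta' Z' y \\ X' y \end{pmatrix}.
\]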
After a solution is reached (here is where you will cringe), I
find the inverse of the big matrix X as

mat1.inv <- solve(L, I)   # L: sparse Cholesky factorization of X; I: conformable identity

where L is the Cholesky factor of X and I is a conformable
identity matrix. This, in essence, computes the inverse of the big
matrix X, and then I can grab all the variances and covariances I
need from it after the solution is reached.
You're right. I did cringe. At the risk of sounding like a broken
record, you really don't need to calculate the inverse of that large,
sparse matrix to get the information you want. At most you have to
solve block by block.
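For instance, in a hypothetical sketch with the Matrix package (reusing a stand-in system like the one above), you can get one contrast variance, or one small block of the inverse, from a handful of solves:

library(Matrix)

## Hypothetical sparse SPD system standing in for the big matrix.
X  <- bandSparse(9, k = 0:1, diagonals = list(rep(4, 9), rep(-1, 8)),
                 symmetric = TRUE)
ch <- Cholesky(X)

## Variance of one contrast c'b (here c = e2 - e5), no full inverse needed.
cvec <- sparseMatrix(i = c(2, 5), j = c(1, 1), x = c(1, -1), dims = c(9, 1))
v    <- as.numeric(crossprod(cvec, solve(ch, cvec)))   # c' X^{-1} c

## Or a small block of X^{-1}: solve for just the columns you need.
cols  <- sparseMatrix(i = c(2, 5), j = c(1, 2), x = 1, dims = c(9, 2))
block <- as.matrix(crossprod(cols, solve(ch, cols)))   # 2x2 block of X^{-1}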
I'm currently doing something similar in the Julia implementation.
The aforementioned arxiv.org paper gives
expressions for the gradient of the profiled log-likelihood. It can
be a big win to evaluate the analytic gradient when optimizing the
criterion. The only problem is that you need to do nearly as much
work to evaluate the gradient as to invert the big matrix. Many years
ago when Saikat and I derived those expressions I thought it was great
and coded up the optimization using that. I fit a model to a large,
complex data set using only the function evaluation, which took a
couple of hours. Then I started it again using the gradient
evaluation, eagerly anticipating how fast it would be. I watched and
watched and eventually went to bed because it was taking so long. The
next morning I got up to find that it had completed one iteration in
twelve hours.
It is possible to do that evaluation for certain model types, but the
general evaluation by just pushing it into a sparse matrix calculation
can be formidable. The Julia implementation does use the gradient but
only for models with nested grouping factors (and that is not yet
complete) or for models with two crossed or nearly crossed grouping
factors.
So in terms of the original question, it would be reasonable to
evaluate the conditional covariance of the random effects under
similar designs - either nested grouping factors, which includes, as a
trivial case, a single grouping factor, or crossed grouping factors.
In the latter case, however, the covariance matrix will be large and
dense so you may question whether it is really important to evaluate
it. I can make sense of 2 by 2 covariance matrices, and maybe 3 by 3,
but 85 by 85 dense covariance matrices are difficult for me to interpret.
From: dmbates at gmail.com [mailto:dmbates at gmail.com] On Behalf Of Douglas Bates
Sent: Wednesday, October 15, 2014 12:37 PM
To: Doran, Harold
Cc: Ben Pelzer; r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] comparing posterior means
On Wed, Oct 15, 2014 at 9:30 AM, Doran, Harold <HDoran at air.org> wrote:
Ben
Yes, you can do this comparison of the conditional means using the
variance of the linear combination, AND there is in fact a
covariance term between them. I do not believe that covariance
term between BLUPs is available in lmer (I wrote my own mixed
model function that does spit this out, however).
Just to be didactic for a moment, take a look at Henderson's
equations (say, at the link below):
http://en.wikipedia.org/wiki/Mixed_model
The covariance term between the BLUPs that you would need comes
from the lower-right block of the leftmost matrix at the final
solution. lmer is not parameterized this way, so the comparison is
not intended to show how that term would be extracted from lmer,
only to show that it does exist in the likelihood and can
(conceivably) be extracted or computed from the terms given by lmer.
I would disagree, Harold, about the relationship between the
formulation used in lmer and that in Henderson's mixed model
equations. There is a strong relationship, which is explicitly
shown in http://arxiv.org/abs/1406.5823
Also shown there is why the modifications from Henderson's
formulation to that in lmer lead to flexibility in model
formulation and much greater speed and stability in fitting such
models. Reversing the positions of the random effects and fixed
effects in the penalized least squares problem and using a
relative covariance factor instead of the covariance matrix allows
for the profiled log-likelihood or profiled REML criterion to be
evaluated. Furthermore, they allow for the sparse Cholesky
decomposition to be used effectively. (Henderson's formulation
does not do as good a job of preserving sparsity.)
I believe you want the conditional variance-covariance matrix of
the random effects given the observed data and the parameter
values. The sparse Cholesky factor L is, up to a scale factor, the
Cholesky factor of the inverse of that variance-covariance matrix.
It is, in fact, more stable to work with the factor L than to try
to evaluate the variance-covariance matrix itself.
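A hedged sketch of that computation from a fitted lmer model (this conditions on the estimated fixed effects and variance parameters, as condVar does; sleepstudy ships with lme4):

library(lme4)
library(Matrix)

fit <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)

L  <- getME(fit, "L")        # sparse Cholesky factor: LL' = P (Lambda'Z'Z Lambda + I) P'
Lt <- getME(fit, "Lambdat")  # Lambda', the transposed relative covariance factor
q  <- nrow(Lt)
Iq <- as(Diagonal(q), "CsparseMatrix")

## Conditional covariance of the spherical random effects:
## Var(u* | y) = sigma^2 (Lambda'Z'Z Lambda + I)^{-1}.
## system = "A" makes solve() undo the fill-reducing permutation.
Vu <- sigma(fit)^2 * solve(L, Iq, system = "A")

## Map back to the original scale, b = Lambda u*:
Vb <- crossprod(Lt, Vu) %*% Lt   # Lambda Vu Lambda'

## The diagonal 2x2 blocks of Vb should match
## attr(ranef(fit, condVar = TRUE)$Subject, "postVar");
## the off-diagonal blocks are the covariance terms discussed above.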
I'm happy to flesh this out in private correspondence if you wish.
-----Original Message-----
From: r-sig-mixed-models-bounces at r-project.org [mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of Ben Pelzer
Sent: Wednesday, October 15, 2014 8:56 AM
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] comparing posterior means
Dear list,
Suppose we have the following two-level null-model, for data
from respondents (lowest level 1) living in countries (highest
level 2):
Y(ij) = b0j + eij = (b0 + u0j) + eij
b0j is the country mean for country j
b0 is the "grand mean"
u0j is the deviation from the grand mean for country j, i.e. the level-2 residual
eij is the level-1 residual

The model is estimated by: lmer(Y ~ 1 + (1|country))
My question is about comparing two particular posterior
country means. As far as I know, for a given country j, the
posterior mean is equal to bb0 + uu0j, where bb0 is the estimate
of b0 and uu0j is the posterior residual estimate of u0j.

To compare two particular posterior country means and test
whether they differ significantly, would it be necessary to know
the variance of bb0 + uu0j for each of the two countries, or would
it be sufficient to know only the variance of uu0j?
The latter variance (of uu0j) can be extracted using

rr <- ranef(modela, condVar=TRUE)   # conditional means, with conditional variances attached
attr(rr[[1]], "postVar")            # 1 x 1 x J array of per-country conditional variances
However, the variance of bb0 + uu0j also depends on the variance
of bb0 and on the covariance of bb0 and uu0j (if this covariance
is not equal to zero, of course, which I don't know...).

On the other hand, the difference between two posterior country
means, for countries A and B say, would equal
bb0 + u0A - (bb0 + u0B) = u0A - u0B, meaning that I wouldn't need
to worry about the variance of bb0.
So my main question is about comparing and testing the difference
between two posterior country means. Thanks for any help,
Ben.
_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models