Dear Jarrod
Thank you so much and sorry for the confusion. I recalculated Cinv
with the original specification
m1 <- lmer(Difference ~ 1+ (1|Examiner) + (1|Item), data=englisho.data)
and it is now symmetrical. I will assume the following ordering
[overall intercept, uexaminer1...uexaminer22, uitem1...uitem24] and
will use attr(ranef(m1, postVar=T)[["Examiner"]], "postVar")[1,1,] for
the variances of random effects and Cinv for the covariances with
fixed effects.
All the best,
Daniel
On Fri, Feb 8, 2013 at 3:18 PM, Jarrod Hadfield <j.hadfield at ed.ac.uk> wrote:
Dear Daniel,
Cinv should be symmetrical (although it might differ slightly because of
rounding errors when inverting). The row/columns should be [overall
intercept,
,uitem1...uitem24, uexaminer1...uexaminer22] for your m1 model in this
posting. However, in your original posting you fitted item/examiner in a
different order:
m1 <- lmer(Difference ~ 1+ (1|Examiner) + (1|Item), data=englisho.data)
and it was for this model for which my Cinv code would work.
I thought
attr(ranef(m1, postVar=T)[["Examiner"]], "postVar")[1,1,]
would be equal to diag(Cinv[1+1:ne]) (from the original specification:
Examiner before Item) but actually it seems closer to something like this:
C = t(W)%*%W/Vr+bdiag(Diagonal(nb)*0,Diagonal(ne)/Ve,Diagonal(ni)/Vi)
Cinv = solve(C[-c(1:nb),-c(1:nb)])
diag(Cinv)[1:ne]
I don't think this is a good thing - it assumes the fixed effects, as well
as the variance parameters, are known without error.
Cheers,
Jarrod
Quoting Daniel Caro <dcarov at gmail.com> on Fri, 8 Feb 2013 14:32:53 +0000:
Dear Jarrod
Thank you for the practical advice and thank you all for the
interesting comments.
I was able to calculate Cinv. In my model,
m1 <- lmer(Difference ~ 1+ (1|Item) + (1|Examiner), data=englisho.data)
this matrix is of size 47x47, which is equal to the number of Items
(ni=24) + number of Examiners (ne=22) + number of fixed effects
(nb=1). Or p (1)+q (46) in the lme4 manual. Cinv is not symmetric and
I am wondering how to interpret the position of covariances in the
matrix. I thought the row/columns would be: [overall intercept,
uitem1...uitem24, uexaminer1...uexaminer22]. But it seems not? And is
the diagonal for the random effects for the examiners, for example,
comparable with the results of
attr(ranef(m1, postVar=T)[["Examiner"]], "postVar")[1,1,]
?
Again, thank you for your help.
All the best,
Daniel
On Thu, Feb 7, 2013 at 11:01 PM, Jarrod Hadfield <j.hadfield at ed.ac.uk>
wrote:
Dear Daniel,
To give a practical, rather than philosophical, solution to your problem
you
could perhaps try this....
W<-cBind(X,Z)
Cinv =
solve(t(W)%*%W/Vr+bdiag(Diagonal(nb)*0,Diagonal(ne)/Ve,Diagonal(ni)/Vi))
where Vr is the residual variance, nb is the number of fixed effects, ne
is
the number of examiners and Ve the examiner variance estimate and ni is
the
number of items and Vi the item variance estimate.
Cinv is the (co)variance matrix of the fixed and random effects (stacked
on
top of each other). If you take the sub-matrix of Cinv that pertains to
the
effects in your prediction, then summing the elements of that sub-matrix
will give you something that is close to the prediction variance. In
general
the sub-matrix will not be diagonal (i.e. there are covariances between
the
random effects and the things we are *calling* fixed effects). Vr, Ve and
Vn
are assumed known. If you want to include their uncertainty in the
prediction interval you will probably have to wear the B-badge more
visibly
on your sleeve I'm afraid.
Cheers,
Jarrod
Quoting "Doran, Harold" <HDoran at air.org> on Thu, 7 Feb 2013 10:02:49
-0500:
Doug:
Sorry if this is too simple a question. Call \hat{\beta}} the estimate
of
a fixed parameter and \hat{b_i} the BLUP for unit i.
If the following exists
\hat{\theta} = \hat{\beta}} + \hat{b_i}
Then, does var(\hat{\theta}) also exist?
From: dmbates at gmail.com [mailto:dmbates at gmail.com] On Behalf Of Douglas
Bates
Sent: Thursday, February 07, 2013 9:57 AM
To: Doran, Harold
Cc: Andrew Robinson; dcarov at gmail.com; r-sig-mixed-models at r-project.org;
Ben Bolker
Subject: Re: [R-sig-ME] [R] lmer - BLUP prediction intervals
On Wed, Feb 6, 2013 at 9:36 AM, Doran, Harold
<HDoran at air.org<mailto:HDoran at air.org>> wrote:
Andrew
Ignoring the important theoretical question for just a moment on whether
it is sensible to do this, there is a covariance term between the BLUPs
and
the fixed effects.
But the picky mathematician in me can't understand in what distribution
this covariance occurs. It makes sense in the Bayesian formulation but
not
in a classical (sampling theory) formulation. The distribution of the
estimator of the fixed effects for known values of the parameters is a
multivariate normal that depends on \beta, \sigma^2 and \Sigma, the
model
matrices X and Z being known. The random variable B doesn't enter into
it.
(One way of writing this variance-covariance of this multivariate normal
is
X'V^{-1}X where V is that matrix that involves Z and Sigma^{-1} - I have
forgotten the exact form).
I know these considerations sound like needless theoretical niceties but
to me they're not. I have to be able to formulate the theoretical basis
before I can make sense of the computational results and, after 20 years
or
so, I'm still having trouble making sense of this.
If that term exists under the model, but it is ignored for purposes of
variance estimation, what would be the reason for doing so?
-----Original Message-----
From:
r-sig-mixed-models-bounces at r-project.org<mailto:r-sig-mixed-models-bounces at r-project.org>
[mailto:r-sig-mixed-<mailto:r-sig-mixed->
models-bounces at r-project.org<mailto:models-bounces at r-project.org>] On
Behalf Of Andrew Robinson
Sent: Tuesday, February 05, 2013 9:15 PM
To: dcarov at gmail.com<mailto:dcarov at gmail.com>
Cc:
r-sig-mixed-models at r-project.org<mailto:r-sig-mixed-models at r-project.org>;
Ben Bolker
Subject: Re: [R-sig-ME] [R] lmer - BLUP prediction intervals
I think that it is a reasonable way to proceed just so long as you
interpret the
intervals guardedly and document your assumptions carefully.
Cheers
Andrew
On Wednesday, February 6, 2013, Daniel Caro wrote:
Dear all
I have not been able to follow the discussion. But I would like to
know if it makes sense to calculate prediction intervals like this:
var(fixed effect+random effect)= var(fixed effect) + var(random
effect) + 0 (i.e., the cov is zero)
and based on this create the prediction intervals. Does this make
sense?
All the best,
Daniel
On Tue, Feb 5, 2013 at 8:54 PM, Douglas Bates
<bates at stat.wisc.edu<mailto:bates at stat.wisc.edu>>
On Tue, Feb 5, 2013 at 2:14 PM, Andrew Robinson <
A.Robinson at ms.unimelb.edu.au<mailto:A.Robinson at ms.unimelb.edu.au>>
wrote:
I'd have thought that the joint correlation matrix would be of the
estimates of the fixed effects and the random effects, rather than
the things themselves.
Well, it may be because I have turned into a grumpy old man but I
get
about terminology and the random effects are not parameters - they
are unobserved random variables. They don't have "estimates" in
the
sense of parameter estimates. The quantities returned by the ranef
function are
conditional means (in the case of a linear mixed model, conditional
modes in general) of the random effects given the observed data
evaluated with the parameters at their estimated values. In the
Bayesian point of view none of this is problematic because they're
all random variables but otherwise I struggle with the
interpretation of how these can be
jointly. If you want to consider the distribution of the random
effects
you need to have known values of the parameters.
The estimates are statistical quantities, with specified
distributions, under the model. The model posits these different
roles (parameter,
variable) for the quantities that are the targets of the
estimates,
but
estimates are just estimates, and as such, they have a correlation
structure under the model, and that correlation structure can be
An imperfect analogy from least-squares regression is the
correlation structure of residual estimates, induced by the model.
We say that the errors are independent, but the model creates a
(modest) correlation structure than can be measured, again,
conditional
Well the residuals are random variables and we can show that at the
least squares estimates of the parameters they will have a known
Gaussian distribution which, it turns out, doesn't depend on the
values of the coefficients. But those are the easy cases. In the
linear mixed model
still have a Gaussian distribution and a linear predictor but that
is for the conditional distribution of the response given the
random
the complete model things get much messier.
I'm not making these points just to be difficult. I have spent a
lot of time thinking about these models and trying to come up with
a
coherent
of describing them. Along the way I have come to the conclusion
that the way these models are often described is, well, wrong. And
those descriptions include some that I have written. For example,
you often
the model described as the linear predictor for an observation plus
a "noise" term, epsilon, and the statement that the distribution of
the random effects is independent of the distribution of the noise
term. I
view the linear predictor as a part of the conditional distribution
of
response given the random effects so it wouldn't make sense to talk
about these distributions being independent. The biggest pitfall
in
your thinking from a linear model to any other kind (GLM, LMM,
GLMM)
is
fact that we can make sense of a Gaussian distribution minus its
mean so
write the linear model in the "signal plus noise" form as Y =
X\beta
+ \epsilon where Y is an n-dimensional random variable, X is the n
by p
matrix, \beta is the p-dimensional vector of coefficients and
\epsilon is an n-dimensional Gaussian with mean zero. That doesn't
work in the other cases, despite the heroic attempts of many people
to write things in that way.
Here endeth the sermon.
On Wed, Feb 6, 2013 at 6:34 AM, Douglas Bates
<bates at stat.wisc.edu<mailto:bates at stat.wisc.edu>>
It is possible to create the correlation matrix of the fixed
effects
the random effects jointly using the results from lmer but I have
difficulty deciding what this would represent statistically. If
you