coef se in lme

?vcov
The vcov method returns the estimated variance-covariance matrix of
the fixed-effects only.  I think Irene's question is about the
combination of the fixed-effects parameters and the BLUPs of the
random effects that is returned by the coef method applied to an lmer
object.  (You may recall that you were the person who requested such a
method in lme4 like the coef method in nlme :-)

On the face of it this quantity should be easy to define and evaluate
but in fact it is not easy to do so because these are combinations of
model parameters (the fixed effects) and unobserved random variables
(the random effects).  It gets a bit tricky trying to decide what the
variance of this combination would be.  I think there is a sensible
definition, or at least a computationally reasonable definition, but
there are still a few slippery points in the argument.

Lately I have taken to referring to the "estimates" of the random
effects, what are sometimes called the BLUPs or Best Linear Unbiased
Predictors, as the "conditional modes" of the random effects.  That
is, they are the values that maximize the density of the random
effects given the observed data and the values of the model
parameters.  For a linear mixed model the conditional distribution of
the random effects is multivariate normal so the conditional modes are
also the conditional means.  Also, we can evaluate the conditional
variance-covariance matrix of the random effects up to a scale factor.

The next part is where things get a bit hazy for me but I think it
makes sense to consider the joint distribution of the estimator of the
fixed-effects parameters and the random effects conditional on the
data and, possibly, on the variance components.  Conditional on the
relative variance-covariance of the random effects (i.e. the matrix
that occurs as the penalty term in the penalized least squares
representation of the model) the joint distribution of the
fixed-effects estimators and the random effects is multivariate normal
with mean and variance-covariance matrix determined from the
mixed-model equations.

This big (p+q by p+q, where p is the dimension of the fixed effects
and q is the dimension of the random effects) variance-covariance
matrix could be evaluated and, from that, the variance of any linear
combination of components.  However, I have my doubts about whether it
is the most sensible answer to evaluate.  Conditioning on the relative
variance-covariance matrix of the random effects is cheating, in a
way.  It would be like saying we have a known variance, $\sigma^2$
when, in fact, we are using an estimate.  The fact that we don't know
$\sigma^2$ is what gives rise to the t distributions and F
distributions in linear models and we are all trained to pay careful
attention to the number of degrees of freedom in that estimate and how
it affects our ideas of the precision of the estimates of other model
parameters.  For mixed models, though, many practioners are quite
comfortable conditioning on the value of some of the variance
components but not others.  It could turn out that conditioning on the
relative variance-covariance of the random effects is not a big deal
but I don't know.  I haven't examined it in detail and I don't know of
others who have.

Another approach entirely is to use Markov chain Monte Carlo to
examine the joint distribution of the parameters (in the Bayesian
sense) and the random effects.  If you save the fixed effects and the
random effects from the MCMC chain then you can evaluate the linear
combination of interest throughout the chain and get an empirical
distribution of the quantities returned by coef.

This is probably an unsatisfactory answer for Irene who may have
wanted something quick and simple.  Unfortunately, I don't think there
is a quick, simple answer here.

I suggest we move this discussion to the R-SIG-Mixed-Models list which
I am cc:ing on this reply.
-----Original Message-----
From: r-help-bounces at r-project.org on behalf of Irene Mantzouni
Sent: Mon 10/15/2007 3:20 PM
To: r-help at stat.math.ethz.ch
Subject: [R] coef se in lme

Hi all!

How is it possible to estimate  standard errors for coef obtained from lme?
Is there sth like se.coef() for lmer or what is the anaytical solution?

Thank you!

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

        [[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

I'm particularly interested in this, so I'll bite. Let me start with
something very basic and build up later as I think more about this.

Assume we have a model such as

Y_ij = \mu + a_i + e_ij \quad a_i \sim N(0, s^2), e_ij \sim N(0, r^2)

In this basic model, we also assume the variance components are
orthogonal, denoted e_ij \bot a_i

Now, from ranef() I get the estimate of a for all i (what Doug calls the
conditional modes) and, also using the postVar argument in this
extractor function I get variances of these conditional modes. 

Now, vcov() would return the variance for the fixed parameter \mu.

Given that we can find the variances for each term in the model, and
that they are assumed orthogonal, why is it that we can't simply take
the variance of a linear combination? 

For example, if I want to find:

\hat{y}_ij = \mu + a_i

Then the variance of this would be simply

Var(\hat{y}_ij) = var(\mu) + var(a_i)

And in this case there is no covariance given e_ij \bot a_i, right? 

Even if this scenario is too simplistic, I assume I can use a taylor
series expansion to derive something that gives me an estimate of the
variance of \hat{y}_ij in closed form that is pretty close, no?

OK, I'm ready for my lashing.

Harold
-----Original Message-----
From: dmbates at gmail.com [mailto:dmbates at gmail.com] On Behalf 
Of Douglas Bates
Sent: Wednesday, October 17, 2007 4:05 PM
To: Doran, Harold
Cc: Irene Mantzouni; r-help at stat.math.ethz.ch; R-SIG-Mixed-Models
Subject: Re: [R] coef se in lme

On 10/15/07, Doran, Harold <HDoran at air.org> wrote:
?vcov
The vcov method returns the estimated variance-covariance 
matrix of the fixed-effects only.  I think Irene's question 
is about the combination of the fixed-effects parameters and 
the BLUPs of the random effects that is returned by the coef 
method applied to an lmer object.  (You may recall that you 
were the person who requested such a method in lme4 like the 
coef method in nlme :-)

On the face of it this quantity should be easy to define and 
evaluate but in fact it is not easy to do so because these 
are combinations of model parameters (the fixed effects) and 
unobserved random variables (the random effects).  It gets a 
bit tricky trying to decide what the variance of this 
combination would be.  I think there is a sensible 
definition, or at least a computationally reasonable 
definition, but there are still a few slippery points in the argument.

Lately I have taken to referring to the "estimates" of the 
random effects, what are sometimes called the BLUPs or Best 
Linear Unbiased Predictors, as the "conditional modes" of the 
random effects.  That is, they are the values that maximize 
the density of the random effects given the observed data and 
the values of the model parameters.  For a linear mixed model 
the conditional distribution of the random effects is 
multivariate normal so the conditional modes are also the 
conditional means.  Also, we can evaluate the conditional 
variance-covariance matrix of the random effects up to a scale factor.

The next part is where things get a bit hazy for me but I 
think it makes sense to consider the joint distribution of 
the estimator of the fixed-effects parameters and the random 
effects conditional on the data and, possibly, on the 
variance components.  Conditional on the relative 
variance-covariance of the random effects (i.e. the matrix 
that occurs as the penalty term in the penalized least 
squares representation of the model) the joint distribution 
of the fixed-effects estimators and the random effects is 
multivariate normal with mean and variance-covariance matrix 
determined from the mixed-model equations.

This big (p+q by p+q, where p is the dimension of the fixed 
effects and q is the dimension of the random effects) 
variance-covariance matrix could be evaluated and, from that, 
the variance of any linear combination of components.  
However, I have my doubts about whether it is the most 
sensible answer to evaluate.  Conditioning on the relative 
variance-covariance matrix of the random effects is cheating, 
in a way.  It would be like saying we have a known variance, 
$\sigma^2$ when, in fact, we are using an estimate.  The fact 
that we don't know $\sigma^2$ is what gives rise to the t 
distributions and F distributions in linear models and we are 
all trained to pay careful attention to the number of degrees 
of freedom in that estimate and how it affects our ideas of 
the precision of the estimates of other model parameters.  
For mixed models, though, many practioners are quite 
comfortable conditioning on the value of some of the variance 
components but not others.  It could turn out that 
conditioning on the relative variance-covariance of the 
random effects is not a big deal but I don't know.  I haven't 
examined it in detail and I don't know of others who have.

Another approach entirely is to use Markov chain Monte Carlo 
to examine the joint distribution of the parameters (in the Bayesian
sense) and the random effects.  If you save the fixed effects 
and the random effects from the MCMC chain then you can 
evaluate the linear combination of interest throughout the 
chain and get an empirical distribution of the quantities 
returned by coef.

This is probably an unsatisfactory answer for Irene who may 
have wanted something quick and simple.  Unfortunately, I 
don't think there is a quick, simple answer here.

I suggest we move this discussion to the R-SIG-Mixed-Models 
list which I am cc:ing on this reply.

-----Original Message-----
From: r-help-bounces at r-project.org on behalf of Irene Mantzouni
Sent: Mon 10/15/2007 3:20 PM
To: r-help at stat.math.ethz.ch
Subject: [R] coef se in lme

Hi all!

How is it possible to estimate  standard errors for coef 
obtained from lme?
Is there sth like se.coef() for lmer or what is the 
anaytical solution?
Thank you!

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

        [[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

I'm particularly interested in this, so I'll bite. Let me start with
something very basic and build up later as I think more about this.
Assume we have a model such as
Y_ij = \mu + a_i + e_ij \quad a_i \sim N(0, s^2), e_ij \sim N(0, r^2)
In this basic model, we also assume the variance components are
orthogonal, denoted e_ij \bot a_i
Now, from ranef() I get the estimate of a for all i (what Doug calls the
conditional modes) and, also using the postVar argument in this
extractor function I get variances of these conditional modes.
Now, vcov() would return the variance for the fixed parameter \mu.
Given that we can find the variances for each term in the model, and
that they are assumed orthogonal, why is it that we can't simply take
the variance of a linear combination?
Because the conditional distribution of the random effects (given the
data) depends on the fixed effects.  If you change \mu you will get
different conditonal modes for the a_i and you must take this into
account when considering the variation in the combination you show
below.
For example, if I want to find:
\hat{y}_ij = \mu + a_i
Then the variance of this would be simply
Var(\hat{y}_ij) = var(\mu) + var(a_i)
That would apply if \hat{\mu} and a_i were independent, but they aren't.

For this example (and thank you for suggesting a simple example for
discussion), the calculation I suggested would be conditional on the
ratio s/r or, equivalently, the ratio s^2/r^2.  I refer to this as the
"relative variance" of the random effects or, in more general cases,
the relative variance-covariance of the random effects.

If we condition on both variance components, s^2 and r^2, the joint
distribution of the random effects and the estimators of the fixed
effects is multivariate normal.  The mean of this distribution is the
solution to a penalized linear least squares problem.  Exactly the
same calculations as we would use for a linear least squares problem
will give us the joint distribution of the estimators of the fixed
effects and the random effects marginalized over r^2 but conditional
on the ratio s^2/r^2.

The "marginalized over r^2" of that last sentence is what leads us to
t distributions instead of z distributions and, after suitable
normalization, F distributions instead of chi-squared distributions.
What I haven't quite got a handle on yet is whether it is as important
to consider the effects of uncertainty in our estimate of s^2/r^2 as
it is to consider the effects of uncertainty in our estimate of s^2.
For fixed effects models we know that in some circumstances it is very
important to consider the uncertainty in our estimate of r^2.  This is
what Gosset did in formulating the T distribution and what Fisher
generalized into the analysis of variance.

Can we reasonably assume that variability in our estimate of r^2 is
important but variability in our estimate of s^2/r^2 is not?  If so
then we can apply the usual results from least squares to the
penalized least squares problem (conditional on the ratio r^2/s^2) and
we're done.  We need the model matrices for the fixed and the random
effects and the penalty matrix and we do need to be careful of how the
computation is done, especially in cases such as you consider where
the number of random effects can get into the millions, but the theory
is straightforward.  However, I don't think it is that cut and dried
in all cases.

It is likely that, in your world, conditioning on an estimate of
s^2/r^2 would not be a stretch.  In fact, you could condition on an
estimate of r^2 while you were at it because you have a huge number of
degrees of freedom in your estimate of the residual variance.  The
differences between a T distribution with 10 million degrees of
freedom and a standard normal distribution are not worth worrying
about.

So I think that in your world it would be reasonable to use the
penalized least squares representation and calculate the results from
that.  You mentioned above the "posterior variances" (which I would
call conditional variances if I were to do it over again) of the
random effects.  These come from the penalized least squares
representation but conditional on the fixed effects.  The get the
joint distribution of the random effects and the estimators of the
fixed effects it is only necessary to extend that calculation a bit.
From the joint distribution we can get the variance of the linear
combination you want to consider.

For smaller data sets it may be worthwhile going the Markov chain
Monte Carlo route because that takes into account the variation in all
the parameter estimates and in the random effects.  However, Steven
Novick recently sent me mail with a reproducible example about
difficulties with the Markov chain Monte Carlo samples drawn from some
models.  I'll send a slightly expanded version of that report to the
list for comment later today.
And in this case there is no covariance given e_ij \bot a_i, right?

Even if this scenario is too simplistic, I assume I can use a taylor
series expansion to derive something that gives me an estimate of the
variance of \hat{y}_ij in closed form that is pretty close, no?

OK, I'm ready for my lashing.

Harold

-----Original Message-----
From: dmbates at gmail.com [mailto:dmbates at gmail.com] On Behalf
Of Douglas Bates
Sent: Wednesday, October 17, 2007 4:05 PM
To: Doran, Harold
Cc: Irene Mantzouni; r-help at stat.math.ethz.ch; R-SIG-Mixed-Models
Subject: Re: [R] coef se in lme

On 10/15/07, Doran, Harold <HDoran at air.org> wrote:
?vcov
The vcov method returns the estimated variance-covariance
matrix of the fixed-effects only.  I think Irene's question
is about the combination of the fixed-effects parameters and
the BLUPs of the random effects that is returned by the coef
method applied to an lmer object.  (You may recall that you
were the person who requested such a method in lme4 like the
coef method in nlme :-)

On the face of it this quantity should be easy to define and
evaluate but in fact it is not easy to do so because these
are combinations of model parameters (the fixed effects) and
unobserved random variables (the random effects).  It gets a
bit tricky trying to decide what the variance of this
combination would be.  I think there is a sensible
definition, or at least a computationally reasonable
definition, but there are still a few slippery points in the argument.

Lately I have taken to referring to the "estimates" of the
random effects, what are sometimes called the BLUPs or Best
Linear Unbiased Predictors, as the "conditional modes" of the
random effects.  That is, they are the values that maximize
the density of the random effects given the observed data and
the values of the model parameters.  For a linear mixed model
the conditional distribution of the random effects is
multivariate normal so the conditional modes are also the
conditional means.  Also, we can evaluate the conditional
variance-covariance matrix of the random effects up to a scale factor.

The next part is where things get a bit hazy for me but I
think it makes sense to consider the joint distribution of
the estimator of the fixed-effects parameters and the random
effects conditional on the data and, possibly, on the
variance components.  Conditional on the relative
variance-covariance of the random effects (i.e. the matrix
that occurs as the penalty term in the penalized least
squares representation of the model) the joint distribution
of the fixed-effects estimators and the random effects is
multivariate normal with mean and variance-covariance matrix
determined from the mixed-model equations.

This big (p+q by p+q, where p is the dimension of the fixed
effects and q is the dimension of the random effects)
variance-covariance matrix could be evaluated and, from that,
the variance of any linear combination of components.
However, I have my doubts about whether it is the most
sensible answer to evaluate.  Conditioning on the relative
variance-covariance matrix of the random effects is cheating,
in a way.  It would be like saying we have a known variance,
$\sigma^2$ when, in fact, we are using an estimate.  The fact
that we don't know $\sigma^2$ is what gives rise to the t
distributions and F distributions in linear models and we are
all trained to pay careful attention to the number of degrees
of freedom in that estimate and how it affects our ideas of
the precision of the estimates of other model parameters.
For mixed models, though, many practioners are quite
comfortable conditioning on the value of some of the variance
components but not others.  It could turn out that
conditioning on the relative variance-covariance of the
random effects is not a big deal but I don't know.  I haven't
examined it in detail and I don't know of others who have.

Another approach entirely is to use Markov chain Monte Carlo
to examine the joint distribution of the parameters (in the Bayesian
sense) and the random effects.  If you save the fixed effects
and the random effects from the MCMC chain then you can
evaluate the linear combination of interest throughout the
chain and get an empirical distribution of the quantities
returned by coef.

This is probably an unsatisfactory answer for Irene who may
have wanted something quick and simple.  Unfortunately, I
don't think there is a quick, simple answer here.

I suggest we move this discussion to the R-SIG-Mixed-Models
list which I am cc:ing on this reply.

-----Original Message-----
From: r-help-bounces at r-project.org on behalf of Irene Mantzouni
Sent: Mon 10/15/2007 3:20 PM
To: r-help at stat.math.ethz.ch
Subject: [R] coef se in lme

Hi all!

How is it possible to estimate  standard errors for coef
obtained from lme?
Is there sth like se.coef() for lmer or what is the
anaytical solution?
Thank you!

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

        [[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

The AD Model Builder method I posted earlier takes into account the
uncertainty in the mean and both std deviations in Harold's simple model.

        Y_ij - mu + u_i +eps_ij

To illustrate this
I built a little model and simulated a data set with 1<=i<=10, 1<=j<=5 
observations.  Below are the parameter estiamtes tigether with their
estimated std devs.  The true values were mu=3.0 sigma_u=2.0 and sigma=3.0
    1   mu           2.4091e+00 7.4307e-01
    2   log_sigma_u  6.3908e-01 3.4913e-01
    3   log_sigma    1.1354e+00 1.1180e-01
    4   u           -3.1695e-01 6.4561e-01
    5   u            2.0441e-01 6.4496e-01
    6   u            5.1996e-01 6.4748e-01
    7   u            4.3754e-01 6.4661e-01
    8   u            4.8381e-01 6.4708e-01
    9   u            1.2214e+00 6.6075e-01
   10   u           -1.7631e+00 6.7793e-01
   11   u            3.3987e-01 6.4578e-01
   12   u           -1.7663e-01 6.4485e-01
   13   u           -9.5030e-01 6.5439e-01
then I fixed log_sigma_u and log_sigma at their
estimated values and obtained.
# fixed sd's
 index   name   value      std dev
     1   mu  2.4093e+00 7.4307e-01
     2   u  -3.1736e-01 6.4407e-01
     3   u   2.0456e-01 6.4407e-01
     4   u   5.2045e-01 6.4407e-01
     5   u   4.3794e-01 6.4407e-01
     6   u   4.8426e-01 6.4407e-01
     7   u   1.2226e+00 6.4407e-01
     8   u  -1.7650e+00 6.4407e-01
     9   u   3.4016e-01 6.4407e-01
    10   u  -1.7689e-01 6.4407e-01
    11   u  -9.5138e-01 6.4407e-01
and finally I almost fixed mu
(can't fix it completley because then there would be
no "fixed" effects and the model thinks there is
nothing to do.) and obtained
# all fixed
 index   name   value      std dev
     1   mu  2.4093e+00 2.2361e-03
     2   u  -3.1736e-01 5.9145e-01
     3   u   2.0456e-01 5.9145e-01
     4   u   5.2045e-01 5.9145e-01
     5   u   4.3794e-01 5.9145e-01
     6   u   4.8426e-01 5.9145e-01
     7   u   1.2226e+00 5.9145e-01
     8   u  -1.7650e+00 5.9145e-01
     9   u   3.4016e-01 5.9145e-01
    10   u  -1.7689e-01 5.9145e-01
    11   u  -9.5138e-01 5.9145e-01
So for example u(1) the first random effect has estimated std devs

  6.4561e-01, 6.4407e-01, and 5.9145e-01

under the three models. It appears that most of the "extra" uncertainty 
in u(1) comes from the uncertainty in mu.
David A. Fournier
P.O. Box 2040,
Sidney, B.C. V8l 3S3
Canada
Phone/FAX 250-655-3364
http://otter-rsch.com