coef se in lme
On 10/18/07, dave fournier <otter at otter-rsch.com> wrote:
Here is one approach to this problem.
In the AD Model Builder Random Effects package we provide estimated standard deviations for any function of the fixed and random effects, (here I include the parameters which detemine the covarince matrices if present) and the random effects. This is for general nonlinear random effects models, but the calculations can be used for linear models as well. We calculate these estimates as follows. Let L(x,u) be the log-likelihood function for the parameters x and u given the observed data, where u is the vector of random effects and x is the vector of the other parameters.
I know it may sound pedantic but I don't know what a log-likelihood L(x,u) would be because you are treating parameters and the random effects as if they are the same type of object and they're not. If you want to use a Bayesian approach you can kind of level the playing field and say that everything is a parameter except for the observed data values. However, Bayesians also need to come up with a prior and that isn't trivial in this case, as I tried to indicate in my message about the mcmcsamp chain taking excursions. I find I can easily confuse myself in the theory of the maximum likelihood or REML estimates if I am not careful about the terminology and the roles of the different coefficients in the linear predictor. I think I would call that function L(x,u) the conditional density of the random effects given the data. The parameters determine the joint density for the random effects and responses so plugging the observed values of the responses into this expression yields the conditional density of the random effects.
Let F(x) be the log-likelihood for x after the u have been integrated out. This integration might be exact or more commonly via the Laplace approximation or something else. For any x let uhat(x) be the value of u which maximizes L(x,u),
I think that is what I would call the conditional modes of the random effects. These depend on the observed responses and the model parameters.
and let xhat be the value of x which maximizes F(x).
The estimate for the covariance matrix for the x is then S_xx = inv(F_xx) and the estimated full covariance matrix Sigma for the x and u is given by
S_xx S_xx * uhat_x (S_xx * uhat_x)' uhat' * S_xx * uhat_x + inv(L_uu)
where ' denotes transpose _x denotes first derivative wrt x (note that uhat is a function of x so that uhat_x makes sense) and _xx _uu denote the second derivatives wrt x and u. we then use Sigma and the delta method to estimate the standard deviation of any (differentiable) function of x and u.
I'm getting a little bit lost here. In the example you sent based on Harold's discussion, the dimension of x is 3 and the dimension of u is 10 so Sigma is a 13 by 13 matrix, right? S_xx is 3 by 3 and L_uu is 10 by 10. To form the product S_xx*uhat_x I think that uhat_x needs to be 3 by 10. Is that right? (I'm used to writing the Jacobian of a vector-valued function of a vector the other way around.) It looks like you are missing a _x in the first term in "uhat' * S_xx * uhat_x". To evaluate L_uu you need a value of x. I assume you will use the parameter estimates. Correct? Will the parameterization of x affect the result? If I write the model in terms of the logarithms of the variances instead of the variances I will definitely get a different Sigma but will the result for a linear combination of mu and some of the u's stay invariant? If it isn't invariant, how do I choose the parameterization of the variance components? Can you give a bit more detail on how you justify mixing derivatives of the marginal log-likelihood (F) with derivatives of the conditional density (L). Do you know that these are on the same scale? I'm willing to believe that they are - it is just that I can't see right off why they should be.