
Correlation of Fixed Effects

5 messages · Gorjanc Gregor, Ben Bolker, Kingsford Jones +1 more

#
Hi!

The default print method also outputs the Correlation of Fixed Effects.
How is this computed and what does it actually represent? I have two models
that essentially give me the same message, but in one model the correlations
between covariates are really high (0.9 and above), while in the other model
the use of poly() reduced the correlations a lot! Should I care?

Thanks!

Lep pozdrav / With regards,
    Gregor Gorjanc
----------------------------------------------------------------------
University of Ljubljana       PhD student
Biotechnical Faculty          www: http://gregor.gorjanc.googlepages.com
Department of Animal Science  blog: http://ggorjan.blogspot.com
Groblje 3                     mail: gregor.gorjanc <at> bfro.uni-lj.si
SI-1230 Domzale               fax: +386 (0)1 72 17 888
Slovenia, Europe              tel: +386 (0)1 72 17 861
#
Reducing correlations among fixed effects should
improve numerical stability and may help interpretability
(by allowing estimation of some parameters precisely
rather than spreading variance across several correlated
parameters).  If you're not interested in separating
the effects of the different parameters, and if your
model fits OK either way, I wouldn't say it was critical.
   At least that's my impression.  I'm happy to be
enlightened.

  Ben Bolker
Gorjanc Gregor wrote:
#
On Mon, Feb 9, 2009 at 2:52 PM, Gorjanc Gregor
<Gregor.Gorjanc at bfro.uni-lj.si> wrote:
Hi Gregor,

In a standard LM it's calculated as Cov(\hat\beta) = \sigma^{2}(X'IX)^{-1},
where X is the model design matrix and I is an identity matrix.  In
practice \sigma^2 is estimated by the sum of squared residuals divided by
the number of rows in X minus its rank (i.e. the residual degrees of
freedom).
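This is easy to check for a plain lm fit.  A quick sketch of my own (the
object names fit, X, and s2 are just illustrative, not from the thread):

```r
## Check that Cov(beta-hat) = sigma^2 * (X'X)^{-1} matches vcov() for an LM
set.seed(1)
x <- runif(50)
y <- 1 + 2 * x + rnorm(50)
fit <- lm(y ~ x)
X <- model.matrix(fit)

## sigma^2 estimated by RSS divided by the residual degrees of freedom,
## i.e. the number of rows of X minus its rank
s2 <- sum(resid(fit)^2) / (nrow(X) - qr(X)$rank)

## the hand-computed covariance matrix agrees with vcov()
all.equal(s2 * solve(t(X) %*% X), vcov(fit))
```

The last line should return TRUE.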

Although I'm guessing here, I assume the equation changes for an LMM
in that the estimate of \sigma^2 becomes a sum of estimated variance
components, and rather than an identity matrix there may be any
positive definite matrix between the X' and the X (e.g. if the weights
or corr arguments are used in lme we get a non-identity error
covariance).  I tried to confirm this guess by looking at the code
for the vcov method for mer objects, but my S4 skills are too limited
to know how to find it -- anyone?  (And on a side note: the fact
that this can (usually) be easily done is yet another reason why I
would be very happy to see Doug's future work remain in R ;-))

As far as what it represents: as you'd guess, the square roots of the
diagonals are the SEs of the estimated coefficients, and the
off-diagonals are the estimated covariances between those estimates.
Another answer is that the off-diagonals indicate the amount of
collinearity in X.
Not surprisingly, when X has columns that are higher-order terms of
another column, collinearity occurs and the correlations between
coefficients are high.  'poly' produces orthogonal polynomials, so
covariances of the resulting coefficients should be essentially zero.
The nice thing about that is that terms can be added to or removed from
the model without affecting the remaining estimates.  On the other hand,
when estimated coefficients are highly correlated their
interpretations are confounded.
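A toy example of my own showing the raw-powers vs. poly() contrast (data
and names are made up for illustration):

```r
## Correlation of coefficient estimates: raw polynomial terms vs. poly()
set.seed(42)
x <- 1:20
y <- 2 + 0.5 * x + 0.1 * x^2 + rnorm(20)

fit.raw  <- lm(y ~ x + I(x^2))   # raw powers: x and x^2 are highly collinear
fit.poly <- lm(y ~ poly(x, 2))   # orthogonal polynomials

cov2cor(vcov(fit.raw))    # off-diagonal correlations close to +/- 1
cov2cor(vcov(fit.poly))   # off-diagonal correlations essentially zero
```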

hth,

Kingsford Jones
#
Just to be complete, here's an example of getting the correlation of
fixed effects from the covariance matrix:
Linear mixed model fit by REML
Formula: Reaction ~ Days + (Days | Subject)
   Data: sleepstudy
  AIC  BIC logLik deviance REMLdev
 1756 1775 -871.8     1752    1744
Random effects:
 Groups   Name        Variance Std.Dev. Corr
 Subject  (Intercept) 612.092  24.7405
          Days         35.072   5.9221  0.066
 Residual             654.941  25.5918
Number of obs: 180, groups: Subject, 18

Fixed effects:
            Estimate Std. Error t value
(Intercept)  251.405      6.825   36.84
Days          10.467      1.546    6.77

Correlation of Fixed Effects:
     (Intr)
Days -0.138
2 x 2 Matrix of class "dpoMatrix"
          [,1]      [,2]
[1,] 46.574676 -1.452393
[2,] -1.452393  2.389416
[1] -0.1376783
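The commands that produced the output above did not survive the archive;
this is a reconstruction (assuming the lme4 package is installed -- note
that the summary layout of newer lme4 versions differs from the 2009
output shown, though the estimates are the same):

```r
## Reconstructed commands for the sleepstudy example above
library(lme4)
fm <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)
print(fm)          # model summary, including Correlation of Fixed Effects

(vc <- vcov(fm))   # 2 x 2 "dpoMatrix" covariance of the fixed effects

## correlation of (Intercept) and Days, matching the printed -0.138
vc[1, 2] / sqrt(vc[1, 1] * vc[2, 2])
```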



On Mon, Feb 9, 2009 at 6:01 PM, Kingsford Jones
<kingsfordjones at gmail.com> wrote:
3 days later
#
On Mon, Feb 9, 2009 at 3:52 PM, Gorjanc Gregor
<Gregor.Gorjanc at bfro.uni-lj.si> wrote:
It is an approximate correlation of the estimator of the fixed
effects.  (I include the word "approximate" because I should, but in
this case the approximation is very good.)  I'm not sure how to
explain it better than that.  Suppose you took an MCMC sample of the
parameters in the model; then you would expect the sample of the
fixed-effects parameters to display a correlation structure like
this matrix.

As for how it is calculated, look in the vignettes for the definition
of a p by p upper triangular matrix called R_X.  It is returned as the
RX slot in the fitted model.  This matrix is part of the Cholesky
factor of the combined model matrices for the penalized least squares
problem that determines the conditional modes of the random effects
and the conditional estimates of the fixed effects.  If we didn't have
any random effects, this would be the R matrix from the QR
decomposition of X.  The same calculation that creates the correlation
of the coefficients in a fixed-effects model from R creates the
correlation of the fixed-effects coefficients from RX here.  See

?chol2inv
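For a fixed-effects model, that calculation looks like this (my own
illustration on a plain lm fit; the RX slot itself is an lme4 internal,
so I only show the no-random-effects case):

```r
## Correlation of coefficients from the R factor of the QR decomposition of X
fit <- lm(dist ~ speed, data = cars)
R <- qr.R(qr(model.matrix(fit)))   # upper-triangular R, so R'R = X'X

unscaled <- chol2inv(R)            # (X'X)^{-1} = (R'R)^{-1}, via chol2inv
cov2cor(unscaled)                  # the "Correlation of Fixed Effects"
```

The sigma^2 scale factor cancels when converting a covariance to a
correlation, which is why the unscaled (X'X)^{-1} is enough here.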