Hi!
The default print method also outputs the Correlation of Fixed Effects.
How is this computed and what does it actually represent? I have two models
that essentially give me the same message, but in one model the correlations
between covariates are really high (0.9 and higher), while in the other model
the use of poly() reduced the correlations a lot! Should I care?
Thanks!
Lep pozdrav / With regards,
Gregor Gorjanc
----------------------------------------------------------------------
University of Ljubljana PhD student
Biotechnical Faculty www: http://gregor.gorjanc.googlepages.com
Department of Animal Science blog: http://ggorjan.blogspot.com
Groblje 3 mail: gregor.gorjanc <at> bfro.uni-lj.si
SI-1230 Domzale fax: +386 (0)1 72 17 888
Slovenia, Europe tel: +386 (0)1 72 17 861
Correlation of Fixed Effects
5 messages · Gorjanc Gregor, Ben Bolker, Kingsford Jones +1 more
Reducing correlations among fixed effects should improve numerical stability and may help interpretability (by allowing some parameters to be estimated precisely rather than spreading variance across several correlated parameters). If you're not interested in separating the effects of the different parameters, and if your model fits OK either way, I wouldn't say it was critical. At least that's my impression. I'm happy to be enlightened.

Ben Bolker
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Ben Bolker Associate professor, Biology Dep't, Univ. of Florida bolker at ufl.edu / www.zoology.ufl.edu/bolker GPG key: www.zoology.ufl.edu/bolker/benbolker-publickey.asc
On Mon, Feb 9, 2009 at 2:52 PM, Gorjanc Gregor
<Gregor.Gorjanc at bfro.uni-lj.si> wrote:
Hi! The default print method also outputs the Correlation of Fixed Effects. How is this computed and what does it actually represent?
Hi Gregor,
In a standard LM it's calculated as Cov(\beta) = \sigma^{2}(X'IX)^{-1},
where X is the model design matrix. In practice \sigma^2 is estimated
by the sum of squared residuals divided by the number of rows of X
minus its rank.
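For the fixed-effects-only case this is easy to check in R. Here is a minimal sketch on simulated data (the variable names are just for illustration), reproducing vcov() by hand:

```r
## Sketch: reproduce vcov() for an lm fit as sigma^2 (X'X)^{-1}
set.seed(1)
x <- runif(50)
y <- 1 + 2 * x + rnorm(50)
fit <- lm(y ~ x)
X <- model.matrix(fit)
## residual variance estimate: RSS / (n - rank(X))
sigma2 <- sum(resid(fit)^2) / (nrow(X) - qr(X)$rank)
V <- sigma2 * solve(t(X) %*% X)
all.equal(unname(V), unname(vcov(fit)))  # TRUE
## and the printed "Correlation" is just the scaled version:
cov2cor(V)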
Although I'm guessing here, I assume the equation changes for an LMM
in that the estimate of \sigma^2 becomes a sum of estimated variance
components, and rather than an identity matrix, there may be any
positive definite matrix between the X' and the X (e.g. if the weights
or correlation arguments are used in lme we get a non-identity error
covariance). I tried to confirm this 'guess' by looking at the code
for the vcov method for mer objects, but my S4 skills are too limited
to know how to find it --- anyone? (and on a side note -- the fact
that this can (usually) be easily done is yet another reason why I
would be very happy to see Doug's future work to remain in R ;-))
As far as what it represents, as you'd guess the sqrt of the diagonals
are the SEs for the estimated coefficients and the off-diagonals are
the estimated covariances between those estimates. I suppose another
answer is that the off-diagonals provide indication of the amount of
collinearity in X.
I have two models that essentially give me the same message, but in one model the correlations between covariates are really high (0.9 and higher), while in the other model the use of poly() reduced the correlations a lot! Should I care?
Not surprisingly, when X has columns that are higher-order terms of another column, collinearity occurs and the correlations between coefficients are high. 'poly' produces orthogonal polynomials, so the covariances of the resulting coefficients should be essentially zero. The nice thing about that is that terms can be added/removed from the model without affecting the remaining estimates. On the other hand, when estimated coefficients are highly correlated their interpretations are confounded.

hth,
Kingsford Jones
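The raw-versus-orthogonal contrast is easy to see directly. A small sketch (simulated data, ordinary lm for simplicity; the same pattern holds for the fixed effects of a mixed model):

```r
## Sketch: coefficient correlations for raw vs orthogonal polynomial bases
set.seed(1)
x <- runif(100)
y <- 1 + x + x^2 + rnorm(100)
fit_raw  <- lm(y ~ x + I(x^2))   # raw powers: x and x^2 are highly collinear
fit_poly <- lm(y ~ poly(x, 2))   # orthogonal polynomial basis
cov2cor(vcov(fit_raw))   # large off-diagonal correlations
cov2cor(vcov(fit_poly))  # off-diagonals essentially zero
```

With poly() you can drop the quadratic term and the linear coefficient barely moves; with raw powers both estimates change together.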
Just to be complete, here's an example of getting the correlation of fixed effects from the covariance matrix:
example(lmer, package='lme4', echo=FALSE)
fm1
Linear mixed model fit by REML
Formula: Reaction ~ Days + (Days | Subject)
Data: sleepstudy
AIC BIC logLik deviance REMLdev
1756 1775 -871.8 1752 1744
Random effects:
Groups Name Variance Std.Dev. Corr
Subject (Intercept) 612.092 24.7405
Days 35.072 5.9221 0.066
Residual 654.941 25.5918
Number of obs: 180, groups: Subject, 18
Fixed effects:
Estimate Std. Error t value
(Intercept) 251.405 6.825 36.84
Days 10.467 1.546 6.77
Correlation of Fixed Effects:
(Intr)
Days -0.138
vcov(fm1)
2 x 2 Matrix of class "dpoMatrix"
[,1] [,2]
[1,] 46.574676 -1.452393
[2,] -1.452393 2.389416
-1.4524/prod(sqrt(diag(vcov(fm1))))
[1] -0.1376783
3 days later
On Mon, Feb 9, 2009 at 3:52 PM, Gorjanc Gregor
<Gregor.Gorjanc at bfro.uni-lj.si> wrote:
The default print method also outputs the Correlation of Fixed Effects. How is this computed and what does it actually represent?
It is an approximate correlation of the estimator of the fixed effects. (I include the word "approximate" because I should, but in this case the approximation is very good.) I'm not sure how to explain it better than that. Suppose that you took an MCMC sample from the parameters in the model; then you would expect the sample of the fixed-effects parameters to display a correlation structure like this matrix.

As for how it is calculated, look in the vignettes for the definition of a p by p upper triangular matrix called R_X. It is returned as the RX slot in the fitted model. This matrix is part of the Cholesky factor of the combined model matrices for the penalized least squares problem that determines the conditional modes of the random effects and the conditional estimates of the fixed effects. If we didn't have any random effects this would be the R matrix from the QR decomposition of X. The same calculation that creates the correlation of the coefficients in a fixed-effects model from R creates the correlation of the fixed-effects coefficients from RX here. See ?chol2inv
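For the no-random-effects case described above, the chol2inv route can be sketched directly (simulated data; this mimics with qr.R what lme4 does with the RX slot):

```r
## Sketch: correlation of coefficients from the R factor of a QR
## decomposition, via chol2inv (see ?chol2inv)
set.seed(1)
x <- runif(40)
fit <- lm(rnorm(40) ~ x)
X <- model.matrix(fit)
R <- qr.R(qr(X))
## chol2inv(R) gives (R'R)^{-1} = (X'X)^{-1}; the sigma^2 scale factor
## cancels when converting to a correlation matrix
all.equal(unname(cov2cor(chol2inv(R))),
          unname(cov2cor(vcov(fit))))  # TRUE
```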
I have two models that essentially give me the same message, but in one model the correlations between covariates are really high (0.9 and higher), while in the other model the use of poly() reduced the correlations a lot! Should I care?