Question regarding the estimation of the correlation between random intercepts and random slopes in the lme4 package
On 14-07-16 08:37 AM, Inga Schwabe wrote:
Dear list members,

I posted a question regarding the lme4 package in R on stackoverflow.com (see http://stackoverflow.com/questions/24778642/estimating-correlation-between-random-slopes-and-random-intercepts-using-the-lme) and a stackoverflow user kindly referred me to this mailing list. I hope that my question is appropriate here; if it is not, please let me know!

This is my question: To answer my research question I am interested in the correlation between the random slopes and random intercepts in a multilevel model, estimated using the R library lme4. The data I have are: Y (test scores from students), SES (socio-economic status for each student) and schoolid (ID for each school). I am using the following syntax to estimate random intercepts and slopes for the schools:

    library(lme4)
    model3 <- lmer(Y ~ SES + (1 + SES | schoolid))

The reference I used for this syntax is this pdf: http://www.bristol.ac.uk/cmm/learning/module-samples/5-concepts-sample.pdf. On page 19, a similar analysis is described. It says that by specifying the random intercepts and slopes together, we implicitly request that they covary, so the correlation between the random slopes and random intercepts is also estimated -- basically, exactly what I need to answer my research hypothesis. However, when I look at the results of

    summary(model3)

I get the following output:

    Linear mixed model fit by REML ['lmerMod']
    Formula: Y ~ SES + (1 + SES | schoolid)
    REML criterion at convergence: 8256.4

    Scaled residuals:
        Min      1Q  Median      3Q     Max
    -3.1054 -0.6633 -0.0028  0.6810  3.5606

    Random effects:
     Groups   Name        Variance  Std.Dev. Corr
     schoolid (Intercept) 0.6427924 0.80174
              SES         0.0009143 0.03024  1.00
     Residual             0.3290902 0.57366
    Number of obs: 4376, groups: schoolid, 179

    Fixed effects:
                 Estimate Std. Error t value
    (Intercept) -0.036532   0.060582  -0.603
    SES          0.062491   0.009984   6.259

    Correlation of Fixed Effects:
        (Intr)
    SES 0.226

As stated in the output, the correlation between the random slopes and random intercepts equals 1.00, which I find hard to believe. When I call

    VarCorr(model3)$schoolid

in R, I get the following output, which gives the covariance and correlation matrices:

                (Intercept)          SES
    (Intercept)  0.64279243 0.0242429680
    SES          0.02424297 0.0009143255
    attr(,"stddev")
    (Intercept)         SES
     0.80174337  0.03023782
    attr(,"correlation")
                (Intercept) SES
    (Intercept)           1   1
    SES                   1   1

It seems as if the correlation between the slopes and intercepts was set to 1.00 by R. I did not see this in the output of anyone else when I was searching the internet for references on multilevel modelling. Does anybody know what could be the cause of this correlation? Can it be that the correlation is set to 1.00 because otherwise the model would not be identified? Or is it because the variance of the random slopes is so small (0.0009) that the correlation cannot be estimated?

I have tried to simulate data in order to provide a small reproducible dataset. I have, however, not yet been able to reproduce this output with simulated data. As soon as I have the code, I will edit my post and add it.

Many thanks for your time and effort,

Kind regards,
Inga
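[Editor's note: the correlation the poster is asking about can also be pulled out of the fitted model programmatically. A minimal sketch, assuming the `model3` object fitted above; note that `isSingular()` only exists in more recent lme4 versions (>= 1.1-19), after this thread was written:]

    # Sketch: extracting the intercept-slope correlation from a fitted
    # lmer model. Assumes `model3` from the question above.
    library(lme4)

    vc <- VarCorr(model3)[["schoolid"]]        # 2x2 random-effects covariance matrix
    corr_mat <- attr(vc, "correlation")        # corresponding correlation matrix
    corr_is <- corr_mat["(Intercept)", "SES"]  # the 1.00 reported by summary()

    # In newer lme4 versions, a boundary ("singular") fit can be
    # detected directly:
    isSingular(model3)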
I was thinking about answering on StackOverflow, but since you've asked here: yes, your second-to-last paragraph is exactly correct. You don't have enough information (or equivalently, there is too much noise) in your data to uniquely identify the full variance-covariance matrix, so the results are 'singular'; that is, one of the underlying components is identified as zero. Common symptoms of this situation include either variances equal to zero (or nearly equal, although I see that in your case the variances, while small, are *not* exactly zero) or correlations equal to +/- 1.

http://rpubs.com/bbolker/4187 shows some simulated examples where the estimated variance collapses to zero (despite the true, simulated model having a non-zero among-group variance). Presumably one could make up similar examples (with randomized-block designs/random-slope models) that would show analogous situations of correlations collapsing to +/- 1.

It's a little bit surprising that you have encountered this problem, since it is most commonly a problem with small numbers of levels in the grouping variables, which doesn't appear to be the case here. See http://glmm.wikidot.com/faq#singular_fits for more information ...

Ben Bolker
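[Editor's note: a small simulation along the lines Ben suggests (all parameter values hypothetical) will frequently reproduce the symptom: when the true random-slope variance is tiny relative to the residual noise, the fitted variance-covariance matrix often lands on the boundary, with the intercept-slope correlation collapsing to +/- 1 or a variance estimated as 0:]

    # Simulated example (hypothetical parameters): the true slope SD is so
    # small relative to the residual SD that the fit is frequently singular.
    library(lme4)
    set.seed(101)

    ngrp <- 30                    # number of schools
    nper <- 10                    # students per school
    schoolid <- factor(rep(seq_len(ngrp), each = nper))
    SES <- rnorm(ngrp * nper)

    b0 <- rnorm(ngrp, sd = 0.8)   # random intercepts
    b1 <- rnorm(ngrp, sd = 0.01)  # nearly-zero true random-slope SD

    Y <- 0.06 * SES + b0[schoolid] + b1[schoolid] * SES +
      rnorm(ngrp * nper, sd = 0.6)

    m <- lmer(Y ~ SES + (1 + SES | schoolid))
    VarCorr(m)  # inspect the Corr column for a +/- 1 (boundary) estimate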