Suppose that I have the following dataset in R: library(lme4) data(Machines,package="nlme") mydata <- Machines[Machines$Machine!='C?,] With the following model: print(lmer(score ~ 1 + (1|Worker/Machine), data=mydata), ranef.comp="Var") I have the variance components as shown below: Random effects: Groups Name Variance Machine:Worker (Intercept) 46.00 Worker (Intercept) 13.84 Residual 1.16 I have trouble understanding exactly what the first two components are: Machine:Worker and Worker? Specifically, 1) What is the variance for Worker: corresponding to the base (or reference) level of the factor Machine? If so, what is the base level: the first level in the dataset or alphabetically the first level (it happens to be the same in this particular dataset)? 2) What is the variance for Machine:Worker? Is it the variance for the second level of the factor Machine, or the extra variance relative to the variance for Worker? Furthermore, for the model: print(lmer(score ~ 1 + (1|Worker/Machine), data=Machines), ranef.comp="Var") what is the variance for Machine:Worker in the following result since there are 3 levels involved in the factor Machine? Random effects: Groups Name Variance Machine:Worker (Intercept) 60.2972 Worker (Intercept) 7.3959 Residual 0.9246 Thanks, Gang
Understanding variance components
8 messages · Henrik Singmann, Phillip Alday, Gang Chen +1 more
8 days later
Hi Gang, Sorry that I so am late to the party, but in case you are still interested I will reply (and, of course, for the archive). The answer is basically given in the old faq: http://glmm.wikidot.com/faq#toc27 (1|site/block) = (1|site)+(1|site:block) Which is exactly what is given in your output. A random intercept for Worker and a random intercept for each worker:Machine interaction. To answer your questions. The random intercepts do not have base or reference levels. They are increments or decrements to the overall intercept for each level of Worker or the Machine:Worker combination. The reported variance is the estimated variance of these increments, which is most likely unequal to the actual variance you would obtain by calculating it from the estimated increments, which are sometimes called BLUPs (I wonder if a better term for those exist). Hope that helps, Henrik PS: Belated Happy New Year to everyone. Am 05.01.2017 um 17:28 schrieb Chen, Gang (NIH/NIMH) [C]:
Suppose that I have the following dataset in R: library(lme4) data(Machines,package="nlme") mydata <- Machines[Machines$Machine!='C?,] With the following model: print(lmer(score ~ 1 + (1|Worker/Machine), data=mydata), ranef.comp="Var") I have the variance components as shown below: Random effects: Groups Name Variance Machine:Worker (Intercept) 46.00 Worker (Intercept) 13.84 Residual 1.16 I have trouble understanding exactly what the first two components are: Machine:Worker and Worker? Specifically, 1) What is the variance for Worker: corresponding to the base (or reference) level of the factor Machine? If so, what is the base level: the first level in the dataset or alphabetically the first level (it happens to be the same in this particular dataset)? 2) What is the variance for Machine:Worker? Is it the variance for the second level of the factor Machine, or the extra variance relative to the variance for Worker? Furthermore, for the model: print(lmer(score ~ 1 + (1|Worker/Machine), data=Machines), ranef.comp="Var") what is the variance for Machine:Worker in the following result since there are 3 levels involved in the factor Machine? Random effects: Groups Name Variance Machine:Worker (Intercept) 60.2972 Worker (Intercept) 7.3959 Residual 0.9246 Thanks, Gang
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
On Fri, 2017-01-13 at 18:06 +0100, Henrik Singmann wrote:
Hi Gang,
:snip:
The reported variance is the estimated variance of these increments,? which is most likely unequal to the actual variance you would obtain by? calculating it from the estimated increments, which are sometimes called? BLUPs (I wonder if a better term for those exist).
I've seen them called "conditional modes" (in a vaguely Bayesian sense) at some point by Doug Bates because of course they're not linear for the GLMM case. Best, Phillip
Hope that helps, Henrik PS: Belated Happy New Year to everyone. Am 05.01.2017 um 17:28 schrieb Chen, Gang (NIH/NIMH) [C]:
Suppose that I have the following dataset in R: library(lme4) data(Machines,package="nlme") mydata <- Machines[Machines$Machine!='C?,] With the following model: print(lmer(score ~ 1 + (1|Worker/Machine), data=mydata), ranef.comp="Var") I have the variance components as shown below: Random effects: ?Groups?????????Name????????Variance ?Machine:Worker (Intercept) 46.00 ?Worker?????????(Intercept) 13.84 ?Residual????????????????????1.16 I have trouble understanding exactly what the first two components are: Machine:Worker and Worker? Specifically, 1) What is the variance for Worker: corresponding to the base (or reference) level of the factor Machine? If so, what is the base level: the first level in the dataset or alphabetically the first level (it happens to be the same in this particular dataset)? 2) What is the variance for Machine:Worker? Is it the variance for the second level of the factor Machine, or the extra variance relative to the variance for Worker? Furthermore, for the model: print(lmer(score ~ 1 + (1|Worker/Machine), data=Machines), ranef.comp="Var") what is the variance for Machine:Worker in the following result since there are 3 levels involved in the factor Machine? Random effects: ?Groups?????????Name????????Variance ?Machine:Worker (Intercept) 60.2972 ?Worker?????????(Intercept)??7.3959 ?Residual????????????????????0.9246 Thanks, Gang
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
4 days later
Happy New Year, Henrik! Thanks for explaining the details. A couple of days after I posted the question, I realized that my question was silly! Once I laid out the LME model equation, my original confusion was resolved. Actually I meant to ask a slightly different question. Let me use the dataset embedded in your ?afex? package as an example: data(obk.long, package = "afex?) Suppose that my base model is lmer(value ~ gender*phase+(1|id), data=obk.long) Is there a way to specify a different variance for each gender in one model? Thanks, Gang
On Jan 13, 2017, at 12:06 PM, Henrik Singmann <singmann at psychologie.uzh.ch> wrote: Hi Gang, Sorry that I so am late to the party, but in case you are still interested I will reply (and, of course, for the archive). The answer is basically given in the old faq: http://glmm.wikidot.com/faq#toc27 (1|site/block) = (1|site)+(1|site:block) Which is exactly what is given in your output. A random intercept for Worker and a random intercept for each worker:Machine interaction. To answer your questions. The random intercepts do not have base or reference levels. They are increments or decrements to the overall intercept for each level of Worker or the Machine:Worker combination. The reported variance is the estimated variance of these increments, which is most likely unequal to the actual variance you would obtain by calculating it from the estimated increments, which are sometimes called BLUPs (I wonder if a better term for those exist). Hope that helps, Henrik PS: Belated Happy New Year to everyone. Am 05.01.2017 um 17:28 schrieb Chen, Gang (NIH/NIMH) [C]:
Suppose that I have the following dataset in R: library(lme4) data(Machines,package="nlme") mydata <- Machines[Machines$Machine!='C?,] With the following model: print(lmer(score ~ 1 + (1|Worker/Machine), data=mydata), ranef.comp="Var") I have the variance components as shown below: Random effects: Groups Name Variance Machine:Worker (Intercept) 46.00 Worker (Intercept) 13.84 Residual 1.16 I have trouble understanding exactly what the first two components are: Machine:Worker and Worker? Specifically, 1) What is the variance for Worker: corresponding to the base (or reference) level of the factor Machine? If so, what is the base level: the first level in the dataset or alphabetically the first level (it happens to be the same in this particular dataset)? 2) What is the variance for Machine:Worker? Is it the variance for the second level of the factor Machine, or the extra variance relative to the variance for Worker? Furthermore, for the model: print(lmer(score ~ 1 + (1|Worker/Machine), data=Machines), ranef.comp="Var") what is the variance for Machine:Worker in the following result since there are 3 levels involved in the factor Machine? Random effects: Groups Name Variance Machine:Worker (Intercept) 60.2972 Worker (Intercept) 7.3959 Residual 0.9246 Thanks, Gang
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Hi Gang, I have an idea which is based on the last example given on ?lmer: ## Fit sex-specific variances by constructing numeric dummy variables ... I am not sure if this is entirely correct, but it looks good to me. If not, hopefully someone more knowledgeable will jump in. ## Original model without gender specific variance: data(obk.long, package = "afex") m1 <- lmer(value ~ gender*phase+(1|id), data=obk.long) summary(m1)$varcor ## Groups Name Std.Dev. ## id (Intercept) 1.6018 ## Residual 1.4820 REMLcrit(m1) ## [1] 911.1599 ## to get gender specific vari8ances, we construct two dummy variables: obk.long$gender_F <- as.numeric(obk.long$gender == "F") obk.long$gender_M <- as.numeric(obk.long$gender == "M") m2 <- lmer(value ~ gender*phase+(0+gender_F|id)+(0+gender_M|id), data=obk.long) summary(m2)$varcor ## Groups Name Std.Dev. ## id gender_F 0.99723 ## id.1 gender_M 2.03404 ## Residual 1.48196 REMLcrit(m2) ## [1] 908.297 So far, looks reasonably close. Same for the conditional modes (thx Phillip). Left two columns is separate, right is joint variance. cbind(ranef(m2)$id, rep(NA, 16), ranef(m1)$id) ## gender_F gender_M rep(NA, 16) (Intercept) ## 1 0.0000000 -3.0986753 NA -3.0351426 ## 2 0.0000000 -2.1328544 NA -2.0891242 ## 3 0.0000000 0.1207276 NA 0.1182523 ## 4 -0.9806229 0.0000000 NA -1.0642708 ## 5 -0.3995130 0.0000000 NA -0.4335918 ## 6 0.0000000 2.6962499 NA 2.6409683 ## 7 0.0000000 1.4084888 NA 1.3796103 ## 8 -0.3995130 0.0000000 NA -0.4335918 ## 9 -0.6900680 0.0000000 NA -0.7489313 ## 10 0.0000000 0.4426679 NA 0.4335918 ## 11 0.0000000 -1.1670336 NA -1.1431057 ## 12 0.0000000 1.7304291 NA 1.6949498 ## 13 1.3438166 0.0000000 NA 1.4584452 ## 14 -0.6900680 0.0000000 NA -0.7489313 ## 15 0.4721518 0.0000000 NA 0.5124267 ## 16 1.3438166 0.0000000 NA 1.4584452 And, finally, the same for the fixed effects: fixef(m1) ## (Intercept) genderM phasepost ## 6.000000e+00 7.500000e-01 -6.250000e-01 ## phasepre genderM:phasepost genderM:phasepre ## -2.000000e+00 -1.243450e-15 -1.687539e-15 fixef(m2) ## (Intercept) genderM phasepost ## 6.000000e+00 7.500000e-01 -6.250000e-01 ## phasepre genderM:phasepost genderM:phasepre ## -2.000000e+00 1.829648e-14 1.820766e-14 Hope that helps, Henrik Am 18.01.2017 um 23:18 schrieb Chen, Gang (NIH/NIMH) [C]:
Happy New Year, Henrik! Thanks for explaining the details. A couple of days after I posted the question, I realized that my question was silly! Once I laid out the LME model equation, my original confusion was resolved. Actually I meant to ask a slightly different question. Let me use the dataset embedded in your ?afex? package as an example: data(obk.long, package = "afex?) Suppose that my base model is lmer(value ~ gender*phase+(1|id), data=obk.long) Is there a way to specify a different variance for each gender in one model? Thanks, Gang
On Jan 13, 2017, at 12:06 PM, Henrik Singmann <singmann at psychologie.uzh.ch> wrote: Hi Gang, Sorry that I so am late to the party, but in case you are still interested I will reply (and, of course, for the archive). The answer is basically given in the old faq: http://glmm.wikidot.com/faq#toc27 (1|site/block) = (1|site)+(1|site:block) Which is exactly what is given in your output. A random intercept for Worker and a random intercept for each worker:Machine interaction. To answer your questions. The random intercepts do not have base or reference levels. They are increments or decrements to the overall intercept for each level of Worker or the Machine:Worker combination. The reported variance is the estimated variance of these increments, which is most likely unequal to the actual variance you would obtain by calculating it from the estimated increments, which are sometimes called BLUPs (I wonder if a better term for those exist). Hope that helps, Henrik PS: Belated Happy New Year to everyone. Am 05.01.2017 um 17:28 schrieb Chen, Gang (NIH/NIMH) [C]:
Suppose that I have the following dataset in R: library(lme4) data(Machines,package="nlme") mydata <- Machines[Machines$Machine!='C?,] With the following model: print(lmer(score ~ 1 + (1|Worker/Machine), data=mydata), ranef.comp="Var") I have the variance components as shown below: Random effects: Groups Name Variance Machine:Worker (Intercept) 46.00 Worker (Intercept) 13.84 Residual 1.16 I have trouble understanding exactly what the first two components are: Machine:Worker and Worker? Specifically, 1) What is the variance for Worker: corresponding to the base (or reference) level of the factor Machine? If so, what is the base level: the first level in the dataset or alphabetically the first level (it happens to be the same in this particular dataset)? 2) What is the variance for Machine:Worker? Is it the variance for the second level of the factor Machine, or the extra variance relative to the variance for Worker? Furthermore, for the model: print(lmer(score ~ 1 + (1|Worker/Machine), data=Machines), ranef.comp="Var") what is the variance for Machine:Worker in the following result since there are 3 levels involved in the factor Machine? Random effects: Groups Name Variance Machine:Worker (Intercept) 60.2972 Worker (Intercept) 7.3959 Residual 0.9246 Thanks, Gang
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Nice, Henrik! One thing we need to resolve, though, is this. ==================== The variances for model m1: (1) intercept summary(m1)$varcor$id[1] [1] 2.565887 (2) residuals attr(summary(m1)$varcor, "sc")^2 [1] 2.196212 ==================== And the variances for model m2: (1) female summary(m2)$varcor$id[1] [1] 0.9944589 (2) male summary(m2)$varcor$id.1[1] [1] 4.137316 (3) residuals attr(summary(m2)$varcor, "sc")^2 [1] 2.196212 ==================== It?s great to see that the residual variance matches between the two models m1 and m2. However, the intercept variance in m1, 2.565887, is not equal to the sum of female and male variances, 0.9944589 + 4.137316 = 5.131775. However, if we divide the total variance of female and male by 2, we have (0.9944589 + 4.137316)/2 = 2.565887. Why is that? If we code the two groups as obk.long$gender_F <- sqrt(2)*as.numeric(obk.long$gender == "F") obk.long$gender_M <- sqrt(2)*as.numeric(obk.long$gender == "M?) then we have the desired result, m2 <- lmer(value ~ gender*phase+(0+gender_F|id)+(0+gender_M|id), data=obk.long) summary(m2)$varcor$id[1]+summary(m2)$varcor$id.1[1] [1] 2.565888 Even though the variance part is reconciled, I cannot come up with a good explanation as to why this coding strategy is required. Any thought? Thanks, Gang
On Jan 19, 2017, at 6:13 AM, Henrik Singmann <singmann at psychologie.uzh.ch<mailto:singmann at psychologie.uzh.ch>> wrote:
Hi Gang, I have an idea which is based on the last example given on ?lmer: ## Fit sex-specific variances by constructing numeric dummy variables ... I am not sure if this is entirely correct, but it looks good to me. If not, hopefully someone more knowledgeable will jump in. ## Original model without gender specific variance: data(obk.long, package = "afex") m1 <- lmer(value ~ gender*phase+(1|id), data=obk.long) summary(m1)$varcor ## Groups Name Std.Dev. ## id (Intercept) 1.6018 ## Residual 1.4820 REMLcrit(m1) ## [1] 911.1599 ## to get gender specific vari8ances, we construct two dummy variables: obk.long$gender_F <- as.numeric(obk.long$gender == "F") obk.long$gender_M <- as.numeric(obk.long$gender == "M") m2 <- lmer(value ~ gender*phase+(0+gender_F|id)+(0+gender_M|id), data=obk.long) summary(m2)$varcor ## Groups Name Std.Dev. ## id gender_F 0.99723 ## id.1 gender_M 2.03404 ## Residual 1.48196 REMLcrit(m2) ## [1] 908.297 So far, looks reasonably close. Same for the conditional modes (thx Phillip). Left two columns is separate, right is joint variance. cbind(ranef(m2)$id, rep(NA, 16), ranef(m1)$id) ## gender_F gender_M rep(NA, 16) (Intercept) ## 1 0.0000000 -3.0986753 NA -3.0351426 ## 2 0.0000000 -2.1328544 NA -2.0891242 ## 3 0.0000000 0.1207276 NA 0.1182523 ## 4 -0.9806229 0.0000000 NA -1.0642708 ## 5 -0.3995130 0.0000000 NA -0.4335918 ## 6 0.0000000 2.6962499 NA 2.6409683 ## 7 0.0000000 1.4084888 NA 1.3796103 ## 8 -0.3995130 0.0000000 NA -0.4335918 ## 9 -0.6900680 0.0000000 NA -0.7489313 ## 10 0.0000000 0.4426679 NA 0.4335918 ## 11 0.0000000 -1.1670336 NA -1.1431057 ## 12 0.0000000 1.7304291 NA 1.6949498 ## 13 1.3438166 0.0000000 NA 1.4584452 ## 14 -0.6900680 0.0000000 NA -0.7489313 ## 15 0.4721518 0.0000000 NA 0.5124267 ## 16 1.3438166 0.0000000 NA 1.4584452 And, finally, the same for the fixed effects: fixef(m1) ## (Intercept) genderM phasepost ## 6.000000e+00 7.500000e-01 -6.250000e-01 ## phasepre genderM:phasepost genderM:phasepre ## -2.000000e+00 -1.243450e-15 -1.687539e-15 fixef(m2) ## (Intercept) genderM phasepost ## 6.000000e+00 7.500000e-01 -6.250000e-01 ## phasepre genderM:phasepost genderM:phasepre ## -2.000000e+00 1.829648e-14 1.820766e-14 Hope that helps, Henrik
On Jan 18, 2017, at 5:18 PM, Chen, Gang (NIH/NIMH) [C] <gangchen at mail.nih.gov<mailto:gangchen at mail.nih.gov>> wrote:
Happy New Year, Henrik! Thanks for explaining the details. A couple of days after I posted the question, I realized that my question was silly! Once I laid out the LME model equation, my original confusion was resolved. Actually I meant to ask a slightly different question. Let me use the dataset embedded in your ?afex? package as an example: data(obk.long, package = "afex?) Suppose that my base model is lmer(value ~ gender*phase+(1|id), data=obk.long) Is there a way to specify a different variance for each gender in one model? Thanks, Gang
On Jan 13, 2017, at 12:06 PM, Henrik Singmann <singmann at psychologie.uzh.ch<mailto:singmann at psychologie.uzh.ch>> wrote:
Hi Gang, Sorry that I so am late to the party, but in case you are still interested I will reply (and, of course, for the archive). The answer is basically given in the old faq: http://glmm.wikidot.com/faq#toc27 (1|site/block) = (1|site)+(1|site:block) Which is exactly what is given in your output. A random intercept for Worker and a random intercept for each worker:Machine interaction. To answer your questions. The random intercepts do not have base or reference levels. They are increments or decrements to the overall intercept for each level of Worker or the Machine:Worker combination. The reported variance is the estimated variance of these increments, which is most likely unequal to the actual variance you would obtain by calculating it from the estimated increments, which are sometimes called BLUPs (I wonder if a better term for those exist). Hope that helps, Henrik PS: Belated Happy New Year to everyone. Am 05.01.2017 um 17:28 schrieb Chen, Gang (NIH/NIMH) [C]: Suppose that I have the following dataset in R: library(lme4) data(Machines,package="nlme") mydata <- Machines[Machines$Machine!='C?,] With the following model: print(lmer(score ~ 1 + (1|Worker/Machine), data=mydata), ranef.comp="Var") I have the variance components as shown below: Random effects: Groups Name Variance Machine:Worker (Intercept) 46.00 Worker (Intercept) 13.84 Residual 1.16 I have trouble understanding exactly what the first two components are: Machine:Worker and Worker? Specifically, 1) What is the variance for Worker: corresponding to the base (or reference) level of the factor Machine? If so, what is the base level: the first level in the dataset or alphabetically the first level (it happens to be the same in this particular dataset)? 2) What is the variance for Machine:Worker? Is it the variance for the second level of the factor Machine, or the extra variance relative to the variance for Worker? Furthermore, for the model: print(lmer(score ~ 1 + (1|Worker/Machine), data=Machines), ranef.comp="Var") what is the variance for Machine:Worker in the following result since there are 3 levels involved in the factor Machine? Random effects: Groups Name Variance Machine:Worker (Intercept) 60.2972 Worker (Intercept) 7.3959 Residual 0.9246 Thanks, Gang _______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models _______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models _______________________________________________ R-sig-mixed-models at r-project.org<mailto:R-sig-mixed-models at r-project.org> mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Hey Gang, hey Henrik (!) :) very insightful query, hope it is okay that I briefly join it, on the last question. Variance is a result of division of squared deviations by sample size. In the complete sample, this means divided by 240; and in the gender samples by 120 each. Have a look on the reversed equations for obtaining the overall variance by combining gender groups, weighted by sample size: (120 * 0.9944589 + 4.137316 * 120)/240 = overall variance (should be equal to) = (0.9944589 + 4.137316)/2 acording to your equation gang left hand can be written as: 120 * (0.9944589 + 4.137316) / 240 which is: (0.9944589 + 4.137316)/2 so nothing to worry, I think :) Best wishes, Ren? 2017-01-19 16:14 GMT+01:00 Chen, Gang (NIH/NIMH) [C] <gangchen at mail.nih.gov> :
Nice, Henrik! One thing we need to resolve, though, is this. ==================== The variances for model m1: (1) intercept summary(m1)$varcor$id[1] [1] 2.565887 (2) residuals attr(summary(m1)$varcor, "sc")^2 [1] 2.196212 ==================== And the variances for model m2: (1) female summary(m2)$varcor$id[1] [1] 0.9944589 (2) male summary(m2)$varcor$id.1[1] [1] 4.137316 (3) residuals attr(summary(m2)$varcor, "sc")^2 [1] 2.196212 ==================== It?s great to see that the residual variance matches between the two models m1 and m2. However, the intercept variance in m1, 2.565887, is not equal to the sum of female and male variances, 0.9944589 + 4.137316 = 5.131775. However, if we divide the total variance of female and male by 2, we have (0.9944589 + 4.137316)/2 = 2.565887. Why is that? If we code the two groups as obk.long$gender_F <- sqrt(2)*as.numeric(obk.long$gender == "F") obk.long$gender_M <- sqrt(2)*as.numeric(obk.long$gender == "M?) then we have the desired result, m2 <- lmer(value ~ gender*phase+(0+gender_F|id)+(0+gender_M|id), data=obk.long) summary(m2)$varcor$id[1]+summary(m2)$varcor$id.1[1] [1] 2.565888 Even though the variance part is reconciled, I cannot come up with a good explanation as to why this coding strategy is required. Any thought? Thanks, Gang On Jan 19, 2017, at 6:13 AM, Henrik Singmann <singmann at psychologie.uzh.ch< mailto:singmann at psychologie.uzh.ch>> wrote: Hi Gang, I have an idea which is based on the last example given on ?lmer: ## Fit sex-specific variances by constructing numeric dummy variables ... I am not sure if this is entirely correct, but it looks good to me. If not, hopefully someone more knowledgeable will jump in. ## Original model without gender specific variance: data(obk.long, package = "afex") m1 <- lmer(value ~ gender*phase+(1|id), data=obk.long) summary(m1)$varcor ## Groups Name Std.Dev. ## id (Intercept) 1.6018 ## Residual 1.4820 REMLcrit(m1) ## [1] 911.1599 ## to get gender specific vari8ances, we construct two dummy variables: obk.long$gender_F <- as.numeric(obk.long$gender == "F") obk.long$gender_M <- as.numeric(obk.long$gender == "M") m2 <- lmer(value ~ gender*phase+(0+gender_F|id)+(0+gender_M|id), data=obk.long) summary(m2)$varcor ## Groups Name Std.Dev. ## id gender_F 0.99723 ## id.1 gender_M 2.03404 ## Residual 1.48196 REMLcrit(m2) ## [1] 908.297 So far, looks reasonably close. Same for the conditional modes (thx Phillip). Left two columns is separate, right is joint variance. cbind(ranef(m2)$id, rep(NA, 16), ranef(m1)$id) ## gender_F gender_M rep(NA, 16) (Intercept) ## 1 0.0000000 -3.0986753 NA -3.0351426 ## 2 0.0000000 -2.1328544 NA -2.0891242 ## 3 0.0000000 0.1207276 NA 0.1182523 ## 4 -0.9806229 0.0000000 NA -1.0642708 ## 5 -0.3995130 0.0000000 NA -0.4335918 ## 6 0.0000000 2.6962499 NA 2.6409683 ## 7 0.0000000 1.4084888 NA 1.3796103 ## 8 -0.3995130 0.0000000 NA -0.4335918 ## 9 -0.6900680 0.0000000 NA -0.7489313 ## 10 0.0000000 0.4426679 NA 0.4335918 ## 11 0.0000000 -1.1670336 NA -1.1431057 ## 12 0.0000000 1.7304291 NA 1.6949498 ## 13 1.3438166 0.0000000 NA 1.4584452 ## 14 -0.6900680 0.0000000 NA -0.7489313 ## 15 0.4721518 0.0000000 NA 0.5124267 ## 16 1.3438166 0.0000000 NA 1.4584452 And, finally, the same for the fixed effects: fixef(m1) ## (Intercept) genderM phasepost ## 6.000000e+00 7.500000e-01 -6.250000e-01 ## phasepre genderM:phasepost genderM:phasepre ## -2.000000e+00 -1.243450e-15 -1.687539e-15 fixef(m2) ## (Intercept) genderM phasepost ## 6.000000e+00 7.500000e-01 -6.250000e-01 ## phasepre genderM:phasepost genderM:phasepre ## -2.000000e+00 1.829648e-14 1.820766e-14 Hope that helps, Henrik On Jan 18, 2017, at 5:18 PM, Chen, Gang (NIH/NIMH) [C] < gangchen at mail.nih.gov<mailto:gangchen at mail.nih.gov>> wrote: Happy New Year, Henrik! Thanks for explaining the details. A couple of days after I posted the question, I realized that my question was silly! Once I laid out the LME model equation, my original confusion was resolved. Actually I meant to ask a slightly different question. Let me use the dataset embedded in your ?afex? package as an example: data(obk.long, package = "afex?) Suppose that my base model is lmer(value ~ gender*phase+(1|id), data=obk.long) Is there a way to specify a different variance for each gender in one model? Thanks, Gang On Jan 13, 2017, at 12:06 PM, Henrik Singmann <singmann at psychologie.uzh.ch <mailto:singmann at psychologie.uzh.ch>> wrote: Hi Gang, Sorry that I so am late to the party, but in case you are still interested I will reply (and, of course, for the archive). The answer is basically given in the old faq: http://glmm.wikidot.com/faq#toc27 (1|site/block) = (1|site)+(1|site:block) Which is exactly what is given in your output. A random intercept for Worker and a random intercept for each worker:Machine interaction. To answer your questions. The random intercepts do not have base or reference levels. They are increments or decrements to the overall intercept for each level of Worker or the Machine:Worker combination. The reported variance is the estimated variance of these increments, which is most likely unequal to the actual variance you would obtain by calculating it from the estimated increments, which are sometimes called BLUPs (I wonder if a better term for those exist). Hope that helps, Henrik PS: Belated Happy New Year to everyone. Am 05.01.2017 um 17:28 schrieb Chen, Gang (NIH/NIMH) [C]: Suppose that I have the following dataset in R: library(lme4) data(Machines,package="nlme") mydata <- Machines[Machines$Machine!='C?,] With the following model: print(lmer(score ~ 1 + (1|Worker/Machine), data=mydata), ranef.comp="Var") I have the variance components as shown below: Random effects: Groups Name Variance Machine:Worker (Intercept) 46.00 Worker (Intercept) 13.84 Residual 1.16 I have trouble understanding exactly what the first two components are: Machine:Worker and Worker? Specifically, 1) What is the variance for Worker: corresponding to the base (or reference) level of the factor Machine? If so, what is the base level: the first level in the dataset or alphabetically the first level (it happens to be the same in this particular dataset)? 2) What is the variance for Machine:Worker? Is it the variance for the second level of the factor Machine, or the extra variance relative to the variance for Worker? Furthermore, for the model: print(lmer(score ~ 1 + (1|Worker/Machine), data=Machines), ranef.comp="Var") what is the variance for Machine:Worker in the following result since there are 3 levels involved in the factor Machine? Random effects: Groups Name Variance Machine:Worker (Intercept) 60.2972 Worker (Intercept) 7.3959 Residual 0.9246 Thanks, Gang
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models _______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models _______________________________________________ R-sig-mixed-models at r-project.org<mailto:R-sig-mixed-models at r-project.org> mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models [[alternative HTML version deleted]] _______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
3 days later
Thanks Ren? for the comment. I'm still puzzled by the fact that the variance decomposition cannot seem to be directly reconciled through the two models themselves, and I hope that someone can offer a better way to interpret this. Gang
From: Ren? [bimonosom at gmail.com]
Sent: Thursday, January 19, 2017 10:55 AM
To: Chen, Gang (NIH/NIMH) [C]
Cc: Henrik Singmann; r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] Understanding variance components
Sent: Thursday, January 19, 2017 10:55 AM
To: Chen, Gang (NIH/NIMH) [C]
Cc: Henrik Singmann; r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] Understanding variance components
Hey Gang, hey Henrik (!) :) very insightful query, hope it is okay that I briefly join it, on the last question. Variance is a result of division of squared deviations by sample size. In the complete sample, this means divided by 240; and in the gender samples by 120 each. Have a look on the reversed equations for obtaining the overall variance by combining gender groups, weighted by sample size: (120 * 0.9944589 + 4.137316 * 120)/240 = overall variance (should be equal to) = (0.9944589 + 4.137316)/2 acording to your equation gang left hand can be written as: 120 * (0.9944589 + 4.137316) / 240 which is: (0.9944589 + 4.137316)/2 so nothing to worry, I think :) Best wishes, Ren? 2017-01-19 16:14 GMT+01:00 Chen, Gang (NIH/NIMH) [C] <gangchen at mail.nih.gov<mailto:gangchen at mail.nih.gov>>: Nice, Henrik! One thing we need to resolve, though, is this. ==================== The variances for model m1: (1) intercept summary(m1)$varcor$id[1] [1] 2.565887 (2) residuals attr(summary(m1)$varcor, "sc")^2 [1] 2.196212 ==================== And the variances for model m2: (1) female summary(m2)$varcor$id[1] [1] 0.9944589 (2) male summary(m2)$varcor$id.1[1] [1] 4.137316 (3) residuals attr(summary(m2)$varcor, "sc")^2 [1] 2.196212 ==================== It?s great to see that the residual variance matches between the two models m1 and m2. However, the intercept variance in m1, 2.565887, is not equal to the sum of female and male variances, 0.9944589 + 4.137316 = 5.131775. However, if we divide the total variance of female and male by 2, we have (0.9944589 + 4.137316)/2 = 2.565887. Why is that? If we code the two groups as obk.long$gender_F <- sqrt(2)*as.numeric(obk.long$gender == "F") obk.long$gender_M <- sqrt(2)*as.numeric(obk.long$gender == "M?) then we have the desired result, m2 <- lmer(value ~ gender*phase+(0+gender_F|id)+(0+gender_M|id), data=obk.long) summary(m2)$varcor$id[1]+summary(m2)$varcor$id.1[1] [1] 2.565888 Even though the variance part is reconciled, I cannot come up with a good explanation as to why this coding strategy is required. Any thought? Thanks, Gang On Jan 19, 2017, at 6:13 AM, Henrik Singmann <singmann at psychologie.uzh.ch<mailto:singmann at psychologie.uzh.ch><mailto:singmann at psychologie.uzh.ch<mailto:singmann at psychologie.uzh.ch>>> wrote: Hi Gang, I have an idea which is based on the last example given on ?lmer: ## Fit sex-specific variances by constructing numeric dummy variables ... I am not sure if this is entirely correct, but it looks good to me. If not, hopefully someone more knowledgeable will jump in. ## Original model without gender specific variance: data(obk.long, package = "afex") m1 <- lmer(value ~ gender*phase+(1|id), data=obk.long) summary(m1)$varcor ## Groups Name Std.Dev. ## id (Intercept) 1.6018 ## Residual 1.4820 REMLcrit(m1) ## [1] 911.1599 ## to get gender specific vari8ances, we construct two dummy variables: obk.long$gender_F <- as.numeric(obk.long$gender == "F") obk.long$gender_M <- as.numeric(obk.long$gender == "M") m2 <- lmer(value ~ gender*phase+(0+gender_F|id)+(0+gender_M|id), data=obk.long) summary(m2)$varcor ## Groups Name Std.Dev. ## id gender_F 0.99723 ## id.1 gender_M 2.03404 ## Residual 1.48196 REMLcrit(m2) ## [1] 908.297 So far, looks reasonably close. Same for the conditional modes (thx Phillip). Left two columns is separate, right is joint variance. cbind(ranef(m2)$id, rep(NA, 16), ranef(m1)$id) ## gender_F gender_M rep(NA, 16) (Intercept) ## 1 0.0000000 -3.0986753 NA -3.0351426 ## 2 0.0000000 -2.1328544 NA -2.0891242 ## 3 0.0000000 0.1207276 NA 0.1182523 ## 4 -0.9806229 0.0000000 NA -1.0642708 ## 5 -0.3995130 0.0000000 NA -0.4335918 ## 6 0.0000000 2.6962499 NA 2.6409683 ## 7 0.0000000 1.4084888 NA 1.3796103 ## 8 -0.3995130 0.0000000 NA -0.4335918 ## 9 -0.6900680 0.0000000 NA -0.7489313 ## 10 0.0000000 0.4426679 NA 0.4335918 ## 11 0.0000000 -1.1670336 NA -1.1431057 ## 12 0.0000000 1.7304291 NA 1.6949498 ## 13 1.3438166 0.0000000 NA 1.4584452 ## 14 -0.6900680 0.0000000 NA -0.7489313 ## 15 0.4721518 0.0000000 NA 0.5124267 ## 16 1.3438166 0.0000000 NA 1.4584452 And, finally, the same for the fixed effects: fixef(m1) ## (Intercept) genderM phasepost ## 6.000000e+00 7.500000e-01 -6.250000e-01 ## phasepre genderM:phasepost genderM:phasepre ## -2.000000e+00 -1.243450e-15 -1.687539e-15 fixef(m2) ## (Intercept) genderM phasepost ## 6.000000e+00 7.500000e-01 -6.250000e-01 ## phasepre genderM:phasepost genderM:phasepre ## -2.000000e+00 1.829648e-14 1.820766e-14 Hope that helps, Henrik On Jan 18, 2017, at 5:18 PM, Chen, Gang (NIH/NIMH) [C] <gangchen at mail.nih.gov<mailto:gangchen at mail.nih.gov><mailto:gangchen at mail.nih.gov<mailto:gangchen at mail.nih.gov>>> wrote: Happy New Year, Henrik! Thanks for explaining the details. A couple of days after I posted the question, I realized that my question was silly! Once I laid out the LME model equation, my original confusion was resolved. Actually I meant to ask a slightly different question. Let me use the dataset embedded in your ?afex? package as an example: data(obk.long, package = "afex?) Suppose that my base model is lmer(value ~ gender*phase+(1|id), data=obk.long) Is there a way to specify a different variance for each gender in one model? Thanks, Gang On Jan 13, 2017, at 12:06 PM, Henrik Singmann <singmann at psychologie.uzh.ch<mailto:singmann at psychologie.uzh.ch><mailto:singmann at psychologie.uzh.ch<mailto:singmann at psychologie.uzh.ch>>> wrote: Hi Gang, Sorry that I so am late to the party, but in case you are still interested I will reply (and, of course, for the archive). The answer is basically given in the old faq: http://glmm.wikidot.com/faq#toc27 (1|site/block) = (1|site)+(1|site:block) Which is exactly what is given in your output. A random intercept for Worker and a random intercept for each worker:Machine interaction. To answer your questions. The random intercepts do not have base or reference levels. They are increments or decrements to the overall intercept for each level of Worker or the Machine:Worker combination. The reported variance is the estimated variance of these increments, which is most likely unequal to the actual variance you would obtain by calculating it from the estimated increments, which are sometimes called BLUPs (I wonder if a better term for those exist). Hope that helps, Henrik PS: Belated Happy New Year to everyone. Am 05.01.2017 um 17:28 schrieb Chen, Gang (NIH/NIMH) [C]: Suppose that I have the following dataset in R: library(lme4) data(Machines,package="nlme") mydata <- Machines[Machines$Machine!='C?,] With the following model: print(lmer(score ~ 1 + (1|Worker/Machine), data=mydata), ranef.comp="Var") I have the variance components as shown below: Random effects: Groups Name Variance Machine:Worker (Intercept) 46.00 Worker (Intercept) 13.84 Residual 1.16 I have trouble understanding exactly what the first two components are: Machine:Worker and Worker? Specifically, 1) What is the variance for Worker: corresponding to the base (or reference) level of the factor Machine? If so, what is the base level: the first level in the dataset or alphabetically the first level (it happens to be the same in this particular dataset)? 2) What is the variance for Machine:Worker? Is it the variance for the second level of the factor Machine, or the extra variance relative to the variance for Worker? Furthermore, for the model: print(lmer(score ~ 1 + (1|Worker/Machine), data=Machines), ranef.comp="Var") what is the variance for Machine:Worker in the following result since there are 3 levels involved in the factor Machine? Random effects: Groups Name Variance Machine:Worker (Intercept) 60.2972 Worker (Intercept) 7.3959 Residual 0.9246 Thanks, Gang _______________________________________________ R-sig-mixed-models at r-project.org<mailto:R-sig-mixed-models at r-project.org> mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models _______________________________________________ R-sig-mixed-models at r-project.org<mailto:R-sig-mixed-models at r-project.org> mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models _______________________________________________ R-sig-mixed-models at r-project.org<mailto:R-sig-mixed-models at r-project.org><mailto:R-sig-mixed-models at r-project.org<mailto:R-sig-mixed-models at r-project.org>> mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models [[alternative HTML version deleted]] _______________________________________________ R-sig-mixed-models at r-project.org<mailto:R-sig-mixed-models at r-project.org> mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models