Dear all, I have a nested data structure and I would like to estimate factor-specific random effect variance. Inspired by the article: http://rstudio-pubs-static.s3.amazonaws.com/6B98_c0ae95A0AAA44A3Aa8a9b6ba9703fcf5.html<http://rstudio-pubs-static.s3.amazonaws.com/6298_c0ae951011144131a8a9b6ba9703fcf5.html> I conducted some simulations: ############### ##--generate data----- ngroup <- 25 nrep<- 5 ##----random effect of group--- meanh <- 0 sigmahA <- 1 ##--these are the two assigned variance of the random factor group for each class (A, B) sigmahB <- 3 ##----residual random error--- meanR <- 0 sigmaR <- 0.2 ##---simulate smaple-------------------------------------------------- set.seed(55) out_A <- NA out_B <- NA raneff_groupA <- rnorm(ngroup, meanh, sigmahA) raneff_groupB <- rnorm(ngroup, meanh, sigmahB) groupID <- seq(1, ngroup) for (i in 1:ngroup) { raneff_red_A <- rnorm(nrep, meanR, sigmaR) temp_A <- 0 + raneff_groupA[i] + raneff_red_A raneff_red_B <- rnorm(nrep, meanR, sigmaR) temp_B <- 1 + raneff_groupB[i] + raneff_red_B out_A <- rbind(out_A, cbind(groupID[i], raneff_groupA[i], raneff_red_A, temp_A)) out_B <- rbind(out_B, cbind(groupID[i], raneff_groupB[i], raneff_red_B, temp_B)) } out_A <- out_A[-1,] out_B <- out_B[-1,] dat <- data.frame(groupID=c(out_A[,1], out_B[,1]), prog=c(rep("A",nrow(out_A)), rep("B",nrow(out_B))), out=c(out_A[,"temp_A"], out_B[,"temp_B"]), raneff_haul=c(out_A[,2], out_B[,2]), error=c(out_A[,3], out_B[,3])) library(lattice) bwplot(out~paste(prog,groupID), data=dat) ##--apply model----- fit2 <- lme(out ~ prog-1, random = list(groupID = pdDiag(~prog)), method = "REML", data = dat) summary(fit2) This code gives the expected model output: as estimated variance of groups for A and B are: 1.07957 2.835139, similar to assigned value ############### ############### Model output: Linear mixed-effects model fit by REML Data: dat AIC BIC logLik 202.7263 220.2934 -96.36314 Random effects: Formula: ~prog | groupID Structure: Diagonal (Intercept) progB Residual StdDev: 1.07957 2.835139 0.1959905 Fixed effects: out ~ prog - 1 Value Std.Error DF t-value p-value progA 0.0316535 0.2166244 224 0.1461215 0.8840 progB 1.1446339 0.6069982 224 1.8857287 0.0606 Correlation: progA progB 0.355 Standardized Within-Group Residuals: Min Q1 Med Q3 Max -2.29868033 -0.63163428 0.03981236 0.64863494 2.63613133 Number of Observations: 250 Number of Groups: 25 ######################### However, it I dont make any other changes, but simply to switch the order of the simulation variance: sigmahA <- 3 sigmahB <- 1 , generate the data and apply the model, the model gives very confusing result: ############### ############### Model output: Linear mixed-effects model fit by REML Data: dat AIC BIC logLik 261.763 279.3301 -125.8815 Random effects: Formula: ~prog | groupID Structure: Diagonal (Intercept) progB Residual StdDev: 3.187986 3.295383 0.1959864 Fixed effects: out ~ prog - 1 Value Std.Error DF t-value p-value progA 0.1455012 0.6378382 224 0.2281162 0.8198 progB 1.0590270 0.9171802 224 1.1546553 0.2495 Correlation: progA progB 0.695 Standardized Within-Group Residuals: Min Q1 Med Q3 Max -2.30847626 -0.63592816 0.03830767 0.64349268 2.63737406 Number of Observations: 250 Number of Groups: 25 ######################### The model output gives equal variance. The same situation happens if I run the same code in the article (link above) and also simply change the order of the variance assigned to the groups. Can someone help me to figure out why the order of assigning the variance matters? And why the second case does not gives the expected results? Thanks Regards, Chun Chen
simulation of factor-specific random effect variance estimation, the order of assigned variance matters?
4 messages · Chen Chun, Paul Debes
Hi Chen, I think you may only have to change the random effect specification from variance contrast to variance means: random = list(groupID = pdDiag(~prog)) to random = list(groupID = pdDiag(~0+prog)) It may also be good to change "groupID" from a continuous covariate to a factor. This should give the expected standard deviations for the random-term groups in both simulations. Just as a note, please be aware that you do not specify variance but standard deviation in your simulations (model output is also in standard deviation). Best, Paul On Wed, 27 Jul 2016 23:48:01 +0300, Chen Chun <talischen at hotmail.com> wrote:
Dear all, I have a nested data structure and I would like to estimate factor-specific random effect variance. Inspired by the article: http://rstudio-pubs-static.s3.amazonaws.com/6B98_c0ae95A0AAA44A3Aa8a9b6ba9703fcf5.html<http://rstudio-pubs-static.s3.amazonaws.com/6298_c0ae951011144131a8a9b6ba9703fcf5.html> I conducted some simulations: ############### ##--generate data----- ngroup <- 25 nrep<- 5 ##----random effect of group--- meanh <- 0 sigmahA <- 1 ##--these are the two assigned variance of the random factor group for each class (A, B) sigmahB <- 3 ##----residual random error--- meanR <- 0 sigmaR <- 0.2 ##---simulate smaple-------------------------------------------------- set.seed(55) out_A <- NA out_B <- NA raneff_groupA <- rnorm(ngroup, meanh, sigmahA) raneff_groupB <- rnorm(ngroup, meanh, sigmahB) groupID <- seq(1, ngroup) for (i in 1:ngroup) { raneff_red_A <- rnorm(nrep, meanR, sigmaR) temp_A <- 0 + raneff_groupA[i] + raneff_red_A raneff_red_B <- rnorm(nrep, meanR, sigmaR) temp_B <- 1 + raneff_groupB[i] + raneff_red_B out_A <- rbind(out_A, cbind(groupID[i], raneff_groupA[i], raneff_red_A, temp_A)) out_B <- rbind(out_B, cbind(groupID[i], raneff_groupB[i], raneff_red_B, temp_B)) } out_A <- out_A[-1,] out_B <- out_B[-1,] dat <- data.frame(groupID=c(out_A[,1], out_B[,1]), prog=c(rep("A",nrow(out_A)), rep("B",nrow(out_B))), out=c(out_A[,"temp_A"], out_B[,"temp_B"]), raneff_haul=c(out_A[,2], out_B[,2]), error=c(out_A[,3], out_B[,3])) library(lattice) bwplot(out~paste(prog,groupID), data=dat) ##--apply model----- fit2 <- lme(out ~ prog-1, random = list(groupID = pdDiag(~prog)), method = "REML", data = dat) summary(fit2) This code gives the expected model output: as estimated variance of groups for A and B are: 1.07957 2.835139, similar to assigned value ############### ############### Model output: Linear mixed-effects model fit by REML Data: dat AIC BIC logLik 202.7263 220.2934 -96.36314 Random effects: Formula: ~prog | groupID Structure: Diagonal (Intercept) progB Residual StdDev: 1.07957 2.835139 0.1959905 Fixed effects: out ~ prog - 1 Value Std.Error DF t-value p-value progA 0.0316535 0.2166244 224 0.1461215 0.8840 progB 1.1446339 0.6069982 224 1.8857287 0.0606 Correlation: progA progB 0.355 Standardized Within-Group Residuals: Min Q1 Med Q3 Max -2.29868033 -0.63163428 0.03981236 0.64863494 2.63613133 Number of Observations: 250 Number of Groups: 25 ######################### However, it I dont make any other changes, but simply to switch the order of the simulation variance: sigmahA <- 3 sigmahB <- 1 , generate the data and apply the model, the model gives very confusing result: ############### ############### Model output: Linear mixed-effects model fit by REML Data: dat AIC BIC logLik 261.763 279.3301 -125.8815 Random effects: Formula: ~prog | groupID Structure: Diagonal (Intercept) progB Residual StdDev: 3.187986 3.295383 0.1959864 Fixed effects: out ~ prog - 1 Value Std.Error DF t-value p-value progA 0.1455012 0.6378382 224 0.2281162 0.8198 progB 1.0590270 0.9171802 224 1.1546553 0.2495 Correlation: progA progB 0.695 Standardized Within-Group Residuals: Min Q1 Med Q3 Max -2.30847626 -0.63592816 0.03830767 0.64349268 2.63737406 Number of Observations: 250 Number of Groups: 25 ######################### The model output gives equal variance. The same situation happens if I run the same code in the article (link above) and also simply change the order of the variance assigned to the groups. Can someone help me to figure out why the order of assigning the variance matters? And why the second case does not gives the expected results? Thanks Regards, Chun Chen [[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Paul V. Debes DFG Research Fellow Division of Genetics and Physiology Department of Biology University of Turku 20014 Finland Email: paul.debes at utu.fi
Thank you Paul for the help. It's now working as I expected. and for lmer(), I guess the correct term should be (0 + prog | groupID)? Regards, Chun ________________________________ ??????: Paul Debes <paul.debes at utu.fi> ????????: 2016??7??28?? 7:22 ??????: r-sig-mixed-models at r-project.org ????: Chen Chun ????: Re: [R-sig-ME] simulation of factor-specific random effect variance estimation, the order of assigned variance matters? Hi Chen, I think you may only have to change the random effect specification from variance contrast to variance means: random = list(groupID = pdDiag(~prog)) to random = list(groupID = pdDiag(~0+prog)) It may also be good to change "groupID" from a continuous covariate to a factor. This should give the expected standard deviations for the random-term groups in both simulations. Just as a note, please be aware that you do not specify variance but standard deviation in your simulations (model output is also in standard deviation). Best, Paul On Wed, 27 Jul 2016 23:48:01 +0300, Chen Chun <talischen at hotmail.com> wrote:
Dear all, I have a nested data structure and I would like to estimate factor-specific random effect variance. Inspired by the article: http://rstudio-pubs-static.s3.amazonaws.com/6B98_c0ae95A0AAA44A3Aa8a9b6ba9703fcf5.html<http://rstudio-pubs-static.s3.amazonaws.com/6298_c0ae951011144131a8a9b6ba9703fcf5.html> I conducted some simulations: ############### ##--generate data----- ngroup <- 25 nrep<- 5 ##----random effect of group--- meanh <- 0 sigmahA <- 1 ##--these are the two assigned variance of the random factor group for each class (A, B) sigmahB <- 3 ##----residual random error--- meanR <- 0 sigmaR <- 0.2 ##---simulate smaple-------------------------------------------------- set.seed(55) out_A <- NA out_B <- NA raneff_groupA <- rnorm(ngroup, meanh, sigmahA) raneff_groupB <- rnorm(ngroup, meanh, sigmahB) groupID <- seq(1, ngroup) for (i in 1:ngroup) { raneff_red_A <- rnorm(nrep, meanR, sigmaR) temp_A <- 0 + raneff_groupA[i] + raneff_red_A raneff_red_B <- rnorm(nrep, meanR, sigmaR) temp_B <- 1 + raneff_groupB[i] + raneff_red_B out_A <- rbind(out_A, cbind(groupID[i], raneff_groupA[i], raneff_red_A, temp_A)) out_B <- rbind(out_B, cbind(groupID[i], raneff_groupB[i], raneff_red_B, temp_B)) } out_A <- out_A[-1,] out_B <- out_B[-1,] dat <- data.frame(groupID=c(out_A[,1], out_B[,1]), prog=c(rep("A",nrow(out_A)), rep("B",nrow(out_B))), out=c(out_A[,"temp_A"], out_B[,"temp_B"]), raneff_haul=c(out_A[,2], out_B[,2]), error=c(out_A[,3], out_B[,3])) library(lattice) bwplot(out~paste(prog,groupID), data=dat) ##--apply model----- fit2 <- lme(out ~ prog-1, random = list(groupID = pdDiag(~prog)), method = "REML", data = dat) summary(fit2) This code gives the expected model output: as estimated variance of groups for A and B are: 1.07957 2.835139, similar to assigned value ############### ############### Model output: Linear mixed-effects model fit by REML Data: dat AIC BIC logLik 202.7263 220.2934 -96.36314 Random effects: Formula: ~prog | groupID Structure: Diagonal (Intercept) progB Residual StdDev: 1.07957 2.835139 0.1959905 Fixed effects: out ~ prog - 1 Value Std.Error DF t-value p-value progA 0.0316535 0.2166244 224 0.1461215 0.8840 progB 1.1446339 0.6069982 224 1.8857287 0.0606 Correlation: progA progB 0.355 Standardized Within-Group Residuals: Min Q1 Med Q3 Max -2.29868033 -0.63163428 0.03981236 0.64863494 2.63613133 Number of Observations: 250 Number of Groups: 25 ######################### However, it I dont make any other changes, but simply to switch the order of the simulation variance: sigmahA <- 3 sigmahB <- 1 , generate the data and apply the model, the model gives very confusing result: ############### ############### Model output: Linear mixed-effects model fit by REML Data: dat AIC BIC logLik 261.763 279.3301 -125.8815 Random effects: Formula: ~prog | groupID Structure: Diagonal (Intercept) progB Residual StdDev: 3.187986 3.295383 0.1959864 Fixed effects: out ~ prog - 1 Value Std.Error DF t-value p-value progA 0.1455012 0.6378382 224 0.2281162 0.8198 progB 1.0590270 0.9171802 224 1.1546553 0.2495 Correlation: progA progB 0.695 Standardized Within-Group Residuals: Min Q1 Med Q3 Max -2.30847626 -0.63592816 0.03830767 0.64349268 2.63737406 Number of Observations: 250 Number of Groups: 25 ######################### The model output gives equal variance. The same situation happens if I run the same code in the article (link above) and also simply change the order of the variance assigned to the groups. Can someone help me to figure out why the order of assigning the variance matters? And why the second case does not gives the expected results? Thanks Regards, Chun Chen [[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
-- Paul V. Debes DFG Research Fellow Division of Genetics and Physiology Department of Biology University of Turku 20014 Finland Email: paul.debes at utu.fi
Hi Chun (sorry for previously confusing your first and last name), The lmer term you suggest fits an unstructured covariance matrix (i.e., also the covariance for the "groupID " random effects between "prog" levels), whereas the lme term is fitting a diagonal covariance matrix (where the covariance is restrained to be zero). The latter is more parsimonious (one less parameter) and corresponds to what you simulated - no covariance. I hope someone else can help you how to specify the same diagonal covariance matrix for lmer to have equivalent models between packages. Unfortunately, I don't know how that works using lmer. Best, Paul On Thu, 28 Jul 2016 12:44:09 +0300, Chen Chun <talischen at hotmail.com> wrote:
Thank you Paul for the help. It's now working as I expected. and for lmer(), I guess the correct term should be (0 + prog | groupID)? Regards, Chun ???: Paul Debes <paul.debes at utu.fi> ????: 2016?7?28? 7:22 ???: r-sig-mixed-models at r-project.org ??: Chen Chun ??: Re: [R-sig-ME] simulation of factor-specific random effect variance estimation, the order of assigned variance matters?
Hi Chen,
I think you may only have to change the random effect specification from variance contrast to variance means: random = list(groupID = pdDiag(~prog)) to random = list(groupID = pdDiag(~0+prog)) It may also be good to change "groupID" from a continuous covariate to a factor. This should give the expected standard deviations for the random-term groups in both simulations. Just as a note, please be aware that you do not specify variance but standard deviation in your simulations (model output is also in standard deviation). Best, Paul On Wed, 27 Jul 2016 23:48:01 +0300, Chen Chun <talischen at hotmail.com> wrote:
Dear all, I have a nested data structure and I would like to estimate factor-specific random effect variance. Inspired by the article: http://rstudio-pubs-static.s3.amazonaws.com/6B98_c0ae95A0AAA44A3Aa8a9b6ba9703fcf5.html<http://rstudio-pubs->static.s3.amazonaws.com/6298_c0ae951011144131a8a9b6ba9703fcf5.html> I conducted some simulations: ############### ##--generate data----- ngroup <- 25 nrep<- 5 ##----random effect of group--- meanh <- 0 sigmahA <- 1 ##--these are the two assigned variance of the random factor group for each class (A, B) sigmahB <- 3 ##----residual random error--- meanR <- 0 sigmaR <- 0.2 ##---simulate smaple-------------------------------------------------- set.seed(55) out_A <- NA out_B <- NA raneff_groupA <- rnorm(ngroup, meanh, sigmahA) raneff_groupB <- rnorm(ngroup, meanh, sigmahB) groupID <- seq(1, ngroup) for (i in 1:ngroup) { raneff_red_A <- rnorm(nrep, meanR, sigmaR) temp_A <- 0 + raneff_groupA[i] + raneff_red_A raneff_red_B <- rnorm(nrep, meanR, sigmaR) temp_B <- 1 + raneff_groupB[i] + raneff_red_B out_A <- rbind(out_A, cbind(groupID[i], raneff_groupA[i], raneff_red_A, temp_A)) out_B <- rbind(out_B, cbind(groupID[i], raneff_groupB[i], raneff_red_B, temp_B)) } out_A <- out_A[-1,] out_B <- out_B[-1,] dat <- data.frame(groupID=c(out_A[,1], out_B[,1]), prog=c(rep("A",nrow(out_A)), rep("B",nrow(out_B))), out=c(out_A[,"temp_A"], out_B[,"temp_B"]), raneff_haul=c(out_A[,2], out_B[,2]), error=c(out_A[,3], out_B[,3])) library(lattice) bwplot(out~paste(prog,groupID), data=dat) ##--apply model----- fit2 <- lme(out ~ prog-1, random = list(groupID = pdDiag(~prog)), method = "REML", data = dat) summary(fit2) This code gives the expected model output: as estimated variance of groups for A and B are: 1.07957 2.835139, similar to assigned value ############### ############### Model output: Linear mixed-effects model fit by REML Data: dat AIC BIC logLik 202.7263 220.2934 -96.36314 Random effects: Formula: ~prog | groupID Structure: Diagonal (Intercept) progB Residual StdDev: 1.07957 2.835139 0.1959905 Fixed effects: out ~ prog - 1 Value Std.Error DF t-value p-value progA 0.0316535 0.2166244 224 0.1461215 0.8840 progB 1.1446339 0.6069982 224 1.8857287 0.0606 Correlation: progA progB 0.355 Standardized Within-Group Residuals: Min Q1 Med Q3 Max -2.29868033 -0.63163428 0.03981236 0.64863494 2.63613133 Number of Observations: 250 Number of Groups: 25 ######################### However, it I dont make any other changes, but simply to switch the order of the simulation variance: sigmahA <- 3 sigmahB <- 1 , generate the data and apply the model, the model gives very confusing result: ############### ############### Model output: Linear mixed-effects model fit by REML Data: dat AIC BIC logLik 261.763 279.3301 -125.8815 Random effects: Formula: ~prog | groupID Structure: Diagonal (Intercept) progB Residual StdDev: 3.187986 3.295383 0.1959864 Fixed effects: out ~ prog - 1 Value Std.Error DF t-value p-value progA 0.1455012 0.6378382 224 0.2281162 0.8198 progB 1.0590270 0.9171802 224 1.1546553 0.2495 Correlation: progA progB 0.695 Standardized Within-Group Residuals: Min Q1 Med Q3 Max -2.30847626 -0.63592816 0.03830767 0.64349268 2.63737406 Number of Observations: 250 Number of Groups: 25 ######################### The model output gives equal variance. The same situation happens if I run the same code in the article (link above) and also simply change the order of the variance assigned to the groups. Can someone help me to figure out why the order of assigning the variance matters? And why the second case does not gives the expected results? Thanks Regards, Chun Chen [[alternative HTML version deleted]]
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
--Paul V. Debes DFG Research Fellow Division of Genetics and Physiology Department of Biology University of Turku 20014 Finland Email: paul.debes at utu.fi
Paul V. Debes DFG Research Fellow Division of Genetics and Physiology Department of Biology University of Turku 20014 Finland Email: paul.debes at utu.fi