lme4 with simulated hierarchical data with correlated errors
3 messages · Raluca Gui, Ben Bolker, Markus Jäntti
This is not trivial. The authors use a completely different estimation approach (and describe, but do not give code for, their procedure). In Appendix 2 the authors describe their method for random-effects estimation, which uses a "feasible GLS estimator" from Verbeek (2000), "A guide to modern econometrics":

"For more details on the computation of the weighting matrix, see Verbeek (2000), Hsiao (1986) and Baltagi (2001). Several other random-effects estimation procedures for model (1) are available that include the iterative GLS (IGLS) approach, (restricted) maximum likelihood (REML), or Bayesian procedures (see e.g. Goldstein, 1995; Longford, 1993)."

It would take considerable work (at least on my part! maybe it's easy/already known for someone else on the list) to work through and understand the characteristics of these different estimation methods. Maybe it's a good thing that REML as implemented by lme4 is less biased?

Ben Bolker
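For readers unfamiliar with the econometric approach, the "feasible GLS" random-effects estimator can be sketched via the textbook quasi-demeaning form (a minimal illustration under balanced-panel assumptions, not the authors' exact procedure; all names here are hypothetical):

```r
## Feasible GLS ("random effects") estimator via quasi-demeaning:
## a sketch of the textbook procedure, not the paper's code.
fgls_re <- function(y, x, g) {
  n_i  <- ave(y, g, FUN = length)          # group sizes
  ybar <- ave(y, g); xbar <- ave(x, g)     # group means
  ## within (fixed-effects) regression estimates sigma_eta^2
  w    <- lm(I(y - ybar) ~ 0 + I(x - xbar))
  s2_e <- sum(resid(w)^2) / (length(y) - length(unique(g)) - 1)
  ## between regression on group means estimates sigma_alfa^2 + sigma_eta^2/n
  b    <- lm(ybar ~ xbar, subset = !duplicated(g))
  s2_a <- max(0, sum(resid(b)^2) / (sum(!duplicated(g)) - 2) - s2_e / mean(n_i))
  ## quasi-demeaning weight theta, then OLS on the transformed data
  theta <- 1 - sqrt(s2_e / (s2_e + mean(n_i) * s2_a))
  lm(I(y - theta * ybar) ~ I(x - theta * xbar))
}
```

When theta = 0 this reduces to pooled OLS, and as theta approaches 1 it approaches the within (fixed-effects) estimator; the FGLS and REML estimators differ mainly in how the variance components are estimated.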
On 14-04-01 07:05 AM, Raluca Gui wrote:
Hello,
I am simulating a two-level multilevel model of the form:
y_ij = beta_0 + beta_1 * x_ij + alfa_i + eta_ij
Individual-level units: i = 1,...,150
Group-level units: j = 1,...,10
The error terms are assumed to follow N(0,1).
I want to compute the magnitude of the bias of the estimators when there is
correlation between x_ij and alfa_i and between x_ij and eta_ij.
I assign the following true values for the beta parameters: beta_0=10,
beta_1=2. I run 250 replications.
I study 2 cases, depending on the magnitude of the correlation:
i) corr(x_ij, alfa_i) = 0 , corr(x_ij, eta_ij) = 0.3
ii) corr(x_ij, alfa_i) = 0.3 , corr(x_ij, eta_ij) = 0.3
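One way to hit target correlations like these exactly in expectation is to draw the three components jointly with MASS::mvrnorm (a sketch for Case ii with hypothetical names, not the paper's generating code; for brevity everything is drawn at the observation level, whereas a true group effect alfa would be drawn once per firm and repeated within firms):

```r
library(MASS)  # for mvrnorm
## Joint draw of (x, alfa, eta) with corr(x, alfa) = corr(x, eta) = 0.3
Sigma <- matrix(c(1.0, 0.3, 0.3,
                  0.3, 1.0, 0.0,
                  0.3, 0.0, 1.0), nrow = 3, byrow = TRUE)
draws  <- mvrnorm(n = 150, mu = c(0, 0, 0), Sigma = Sigma)
x_ij   <- draws[, 1]
alfa_i <- draws[, 2]
eta_ij <- draws[, 3]
```

By contrast, building X as a sum of independent draws (as in the code below) induces a correlation determined by the variance ratio, not the nominal rho values, which may itself explain part of the discrepancy with the paper.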
What I am doing is actually a replication of a paper by Ebbes et al. (2004). The problem is that I do not get the same magnitude of bias as the authors: mine is considerably smaller.
Below are the authors' results across 250 replications (means with std. deviations in parentheses):

                            Case i          Case ii
Fixed effects    beta_0     -               -
                 beta_1     2.43 (0.04)     2.42 (0.04)
Random effects   beta_0     7.88 (0.20)     5.79 (0.30)
                 beta_1     2.42 (0.04)     2.84 (0.06)
                 sigma_alfa 0.99 (0.13)     0.00 (0.00)
                 sigma_eta  0.90 (0.03)     0.91 (0.04)
My code:
require(lme4)
require(MASS)  # loaded but not used below

k <- 10   # total number of firms (groups)
n <- 15   # number of employees within each firm
j <- k*n  # total number of employees
beta0 <- 10
beta1 <- 2
rho1 <- 0.0001  # weight on alfa_i in the outcome (~0, i.e. Case i)
rho2 <- 0.3     # weight on eta_ij in the outcome
beta_zero <- rep(NA, 250)
beta_one  <- rep(NA, 250)
for (i in 1:250) {
  ## generate level-2 data: firm IDs plus two unused firm-level covariates
  grp <- cbind(FID = rep(1:k, each = n), Fint = rnorm(j), Z = rnorm(j))
  ## note: alfa_i is drawn per employee here (length j), not once per firm
  alfa_i <- rnorm(j, mean = 0, sd = 1)
  eta_ij <- rnorm(j, mean = 0, sd = 1)
  ## generate level-1 data: X shares the draws alfa_i and eta_ij,
  ## which is what induces the correlation between X and the errors
  emp <- cbind(EID = 1:j, Eint = rnorm(j),
               X = rnorm(j, mean = 0, sd = 1) + eta_ij + alfa_i)
  m1dat <- data.frame(grp, emp)
  m1dat <- within(m1dat, {
    EID <- factor(EID)
    FID <- factor(FID)
  })
  m1dat$y_ij <- beta0 + beta1*m1dat$X + rho1*alfa_i + rho2*eta_ij
  rand_eff <- lmer(y_ij ~ 1 + X + (1 | FID), data = m1dat)
  beta_zero[i] <- fixef(rand_eff)[[1]]  # lme4 uses fixef(), not fixed.effects()
  beta_one[i]  <- fixef(rand_eff)[[2]]
}
means <- c(mean(beta_zero),mean(beta_one))
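To compare against the paper's table, the replications can be summarized as mean estimate, bias relative to the true values, and Monte Carlo standard deviation (a small follow-on sketch using the vectors filled in the loop above):

```r
## Monte Carlo summary of the 250 replications
true <- c(beta_zero = 10, beta_one = 2)
est  <- cbind(beta_zero, beta_one)
round(rbind(mean = colMeans(est),
            bias = colMeans(est) - true,
            sd   = apply(est, 2, sd)), 3)
```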
Thanks in advance
You probably want to rerun your simulation using nlme::gls. Econometricians tend to use GLS, rather than REML, to estimate what they call random-effects models. Markus Jantti
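Following that suggestion, the random-intercept model can be refit in its marginal form as GLS with a compound-symmetry (exchangeable within-firm) correlation structure; a sketch, assuming `m1dat` from the simulation code above:

```r
library(nlme)
## GLS with compound-symmetry errors within firm: the marginal
## counterpart of the random-intercept model fit by lmer
gls_fit <- gls(y_ij ~ X, data = m1dat,
               correlation = corCompSymm(form = ~ 1 | FID))
coef(gls_fit)  # compare with fixef() from the lmer fit
```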
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Markus Jantti Professor of Economics Swedish Institute for Social Research, Stockholm University