lme4 with simulated hierarchical data with correlated errors
3 messages · Raluca Gui, Ben Bolker, Markus Jäntti
This is not trivial. The authors use a completely different estimation approach (and describe, but do not give code for, their procedure). In Appendix 2 the authors describe their method for random-effects estimation, which uses a "feasible GLS estimator" from Verbeek (2000), "A guide to modern econometrics":

"For more details on the computation of the weighting matrix, see Verbeek (2000), Hsiao (1986) and Baltagi (2001). Several other random-effects estimation procedures for model (1) are available that include the iterative GLS (IGLS) approach, (restricted) maximum likelihood (REML), or Bayesian procedures (see e.g. Goldstein, 1995; Longford, 1993)."

It would take considerable work (at least on my part! maybe it's easy/already known for someone else on the list) to work through and understand the characteristics of these different estimation methods. Maybe it's a good thing that REML as implemented by lme4 is less biased?

Ben Bolker
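For readers unfamiliar with the econometric approach, the "feasible GLS" random-effects estimator can be sketched via the textbook quasi-demeaning form (a minimal illustration under balanced-panel assumptions, not the authors' exact procedure; all names here are hypothetical):

```r
## Feasible GLS ("random effects") estimator via quasi-demeaning:
## a sketch of the textbook procedure, not the paper's code.
fgls_re <- function(y, x, g) {
  n_i  <- ave(y, g, FUN = length)          # group sizes
  ybar <- ave(y, g); xbar <- ave(x, g)     # group means
  ## within (fixed-effects) regression estimates sigma_eta^2
  w    <- lm(I(y - ybar) ~ 0 + I(x - xbar))
  s2_e <- sum(resid(w)^2) / (length(y) - length(unique(g)) - 1)
  ## between regression on group means estimates sigma_alfa^2 + sigma_eta^2/n
  b    <- lm(ybar ~ xbar, subset = !duplicated(g))
  s2_a <- max(0, sum(resid(b)^2) / (sum(!duplicated(g)) - 2) - s2_e / mean(n_i))
  ## quasi-demeaning weight theta, then OLS on the transformed data
  theta <- 1 - sqrt(s2_e / (s2_e + mean(n_i) * s2_a))
  lm(I(y - theta * ybar) ~ I(x - theta * xbar))
}
```

When theta = 0 this reduces to pooled OLS, and as theta approaches 1 it approaches the within (fixed-effects) estimator; the FGLS and REML estimators differ mainly in how the variance components are estimated.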
On 14-04-01 07:05 AM, Raluca Gui wrote:
Hello,
I am simulating a two-level multilevel model of the form:
y_ij = beta_0 + beta_1 * x_ij + alfa_i + eta_ij
Individual-level units: i = 1,...,150
Group-level units: j = 1,...,10
The error terms are assumed to follow N(0,1).
I want to compute the magnitude of the bias of the estimators when there is
correlation between x_ij and alfa_i and between x_ij and eta_ij.
I assign the following true values for the beta parameters: beta_0=10,
beta_1=2. I run 250 replications.
I study 2 cases, depending on the magnitude of the correlation:
i) corr(x_ij, alfa_i) = 0 , corr(x_ij, eta_ij) = 0.3
ii) corr(x_ij, alfa_i) = 0.3 , corr(x_ij, eta_ij) = 0.3
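One way to hit target correlations like these exactly in expectation is to draw the three components jointly with MASS::mvrnorm (a sketch for Case ii with hypothetical names, not the paper's generating code; for brevity everything is drawn at the observation level, whereas a true group effect alfa would be drawn once per firm and repeated within firms):

```r
library(MASS)  # for mvrnorm
## Joint draw of (x, alfa, eta) with corr(x, alfa) = corr(x, eta) = 0.3
Sigma <- matrix(c(1.0, 0.3, 0.3,
                  0.3, 1.0, 0.0,
                  0.3, 0.0, 1.0), nrow = 3, byrow = TRUE)
draws  <- mvrnorm(n = 150, mu = c(0, 0, 0), Sigma = Sigma)
x_ij   <- draws[, 1]
alfa_i <- draws[, 2]
eta_ij <- draws[, 3]
```

By contrast, building X as a sum of independent draws (as in the code below) induces a correlation determined by the variance ratio, not the nominal rho values, which may itself explain part of the discrepancy with the paper.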
What I am doing is actually a replication of a paper by Ebbes et al. (2004). The problem is that I do not get the same magnitude of bias as the authors: mine is considerably smaller.
Below are the authors' results across 250 replications (means with std. deviations in parentheses):

                            Case i          Case ii
Fixed effects    beta_0     -               -
                 beta_1     2.43 (0.04)     2.42 (0.04)
Random effects   beta_0     7.88 (0.20)     5.79 (0.30)
                 beta_1     2.42 (0.04)     2.84 (0.06)
                 sigma_alfa 0.99 (0.13)     0.00 (0.00)
                 sigma_eta  0.90 (0.03)     0.91 (0.04)
My code:
require(lme4)
require(MASS)  # loaded but not used below

k <- 10   # total number of firms (groups)
n <- 15   # number of employees within each firm
j <- k*n  # total number of employees
beta0 <- 10
beta1 <- 2
rho1 <- 0.0001  # weight on alfa_i in the outcome (~0, i.e. Case i)
rho2 <- 0.3     # weight on eta_ij in the outcome
beta_zero <- rep(NA, 250)
beta_one  <- rep(NA, 250)
for (i in 1:250) {
  ## generate level-2 data: firm IDs plus two unused firm-level covariates
  grp <- cbind(FID = rep(1:k, each = n), Fint = rnorm(j), Z = rnorm(j))
  ## note: alfa_i is drawn per employee here (length j), not once per firm
  alfa_i <- rnorm(j, mean = 0, sd = 1)
  eta_ij <- rnorm(j, mean = 0, sd = 1)
  ## generate level-1 data: X shares the draws alfa_i and eta_ij,
  ## which is what induces the correlation between X and the errors
  emp <- cbind(EID = 1:j, Eint = rnorm(j),
               X = rnorm(j, mean = 0, sd = 1) + eta_ij + alfa_i)
  m1dat <- data.frame(grp, emp)
  m1dat <- within(m1dat, {
    EID <- factor(EID)
    FID <- factor(FID)
  })
  m1dat$y_ij <- beta0 + beta1*m1dat$X + rho1*alfa_i + rho2*eta_ij
  rand_eff <- lmer(y_ij ~ 1 + X + (1 | FID), data = m1dat)
  beta_zero[i] <- fixef(rand_eff)[[1]]  # lme4 uses fixef(), not fixed.effects()
  beta_one[i]  <- fixef(rand_eff)[[2]]
}
means <- c(mean(beta_zero),mean(beta_one))
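To compare against the paper's table, the replications can be summarized as mean estimate, bias relative to the true values, and Monte Carlo standard deviation (a small follow-on sketch using the vectors filled in the loop above):

```r
## Monte Carlo summary of the 250 replications
true <- c(beta_zero = 10, beta_one = 2)
est  <- cbind(beta_zero, beta_one)
round(rbind(mean = colMeans(est),
            bias = colMeans(est) - true,
            sd   = apply(est, 2, sd)), 3)
```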
Thanks in advance
You probably want to rerun your simulation using nlme::gls. Econometricians tend to use GLS, rather than REML, to estimate what they call random-effects models. Markus Jantti
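Following that suggestion, the random-intercept model can be refit in its marginal form as GLS with a compound-symmetry (exchangeable within-firm) correlation structure; a sketch, assuming `m1dat` from the simulation code above:

```r
library(nlme)
## GLS with compound-symmetry errors within firm: the marginal
## counterpart of the random-intercept model fit by lmer
gls_fit <- gls(y_ij ~ X, data = m1dat,
               correlation = corCompSymm(form = ~ 1 | FID))
coef(gls_fit)  # compare with fixef() from the lmer fit
```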
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Markus Jantti Professor of Economics Swedish Institute for Social Research, Stockholm University