EM and Missing Data in R
Kyle:

I realize that in HLM circles it is common to use the term "levels", but this is really quite confusing and, in fact, misleading. In a multilevel model, there are multiple levels of random variation (many variance components), but there are not multiple levels of fixed effects. These are linear models with additive fixed and random effects. That is, there are random effects and everything else is just a covariate; there are no levels associated with covariates.

So, now let's consider your question. The matrix notation of the model is

Y = XB + Zu + e

where X is a known model matrix, B are the coefficients of the fixed effects, Z is also a known model matrix, and u are the random effects.

Now, u is completely missing (as is B). If they weren't missing, the problem of solving for B would be easy. That is, if we had the complete data, the maximization problem would be simple. But this is a missing data problem, and so some process is necessary to help us along. That is what EM does. It can be used to augment the missing data in the vector u to form a complete data problem and then perform the maximization w.r.t. B.

So yes, EM is a useful tool for missing data problems. EM, I think, is easily programmable in R. But, because EM is a general algorithm, I think the best path for you is to go to the Dempster et al. paper to understand how it works and how it can be applied to missing data problems. Then you need to consider how it will work with your specific problem, work out the conditional expectations, and then maximize (which is often the easiest part).

HTH,
Harold
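As an illustration of the EM steps described above, here is a minimal sketch in R for the simplest special case of Y = XB + Zu + e: a random-intercept model y_ij = mu + u_i + e_ij, with u_i ~ N(0, tau2) and e_ij ~ N(0, sigma2). The function name em_random_intercept, its arguments, the starting values, and the simulated data are all assumptions made for illustration. This is not code from the thread, and it is not the algorithm that lme4 or any other package uses. It treats only the random effects u as the missing data, so it does not by itself impute missing level-2 covariates.

```r
# EM sketch for y_ij = mu + u_i + e_ij (hypothetical helper, illustration only).
# The random effects u_i play the role of the missing data.
em_random_intercept <- function(y, group, max_iter = 500, tol = 1e-8) {
  group <- as.integer(factor(group))  # group index 1..m for each observation
  n_i   <- tabulate(group)            # number of observations per group

  # Crude starting values (an arbitrary choice)
  mu     <- mean(y)
  tau2   <- var(y) / 2
  sigma2 <- var(y) / 2

  for (iter in seq_len(max_iter)) {
    # E-step: conditional mean and variance of each u_i given y and the
    # current parameter values
    v     <- 1 / (n_i / sigma2 + 1 / tau2)
    u_hat <- v * as.vector(rowsum(y - mu, group)) / sigma2

    # M-step: maximize the expected complete-data log-likelihood
    mu_new     <- mean(y - u_hat[group])
    tau2_new   <- mean(u_hat^2 + v)
    sigma2_new <- mean((y - mu_new - u_hat[group])^2 + v[group])

    delta  <- max(abs(c(mu_new - mu, tau2_new - tau2, sigma2_new - sigma2)))
    mu     <- mu_new
    tau2   <- tau2_new
    sigma2 <- sigma2_new
    if (delta < tol) break
  }

  list(mu = mu, tau2 = tau2, sigma2 = sigma2, u_hat = u_hat, iterations = iter)
}

# Usage on simulated data: 30 groups of 10 observations each
set.seed(1)
g   <- rep(1:30, each = 10)
y   <- 5 + rnorm(30, sd = 2)[g] + rnorm(300, sd = 1)
fit <- em_random_intercept(y, g)
str(fit[c("mu", "tau2", "sigma2", "iterations")])
```

If lme4 is installed, the estimates can be compared with lmer(y ~ 1 + (1 | g), REML = FALSE), which fits the same model by maximum likelihood using a different optimizer. Extending the sketch to covariates, or to missing values in those covariates, means working out the corresponding conditional expectations for that specific problem.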
-----Original Message-----
From: r-sig-mixed-models-bounces at r-project.org [mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of Roberts, Kyle
Sent: Wednesday, January 21, 2009 10:56 AM
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] EM and Missing Data in R

Friends,

Do you know of any way to use the EM algorithm to do imputation for missing data at the second level in R? I have heard some things about this at conferences, but can't put my fingers on the actual references. I have a student who is looking at missing data treatments for level-2 variables. I haven't done any research in this area, but I want to point her in the right direction.

If this is a "nonsensical"-type question, please forgive my naivety! Thanks for your instruction.

Blessings,
Kyle

*********************************************************
Dr. J. Kyle Roberts
Department of Teaching and Learning
Annette Caldwell Simmons School of Education and Human Development
Southern Methodist University
P.O. Box 750381
Dallas, TX 75275
214-768-4494
http://www.hlm-online.com/
*********************************************************
_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models