EM and Missing Data in R

Kyle:

I realize in the HLM circles it is common to use the term "levels", but
this is really quite confusing and, in fact, misleading. In a multilevel
model, there are multiple levels of random variation (many variance
components) but there are not multiple levels of fixed effects. These
are linear models with additive fixed and random effects. That is, there
are random effects and everything else is just a covariate---there are
no levels associated with covariates. So, now let's consider your
question. 

The matrix notation of the model is Y = XB + Zu + e, where X is a known
model matrix, B is the vector of fixed-effect coefficients, Z is also a
known model matrix, u is the vector of random effects, and e is the
residual error.
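To make the notation concrete, here is a small sketch in R that builds X and Z and simulates Y for a random-intercept model. The names (group, x), the sizes, and the true parameter values are assumptions made up for illustration; they are not from your data.

```r
set.seed(1)
n_groups <- 5                      # hypothetical number of groups
n_per    <- 4                      # hypothetical observations per group
group <- factor(rep(seq_len(n_groups), each = n_per))
x     <- rnorm(n_groups * n_per)   # a single covariate

X <- model.matrix(~ x)             # 20 x 2: intercept and slope columns
Z <- model.matrix(~ 0 + group)     # 20 x 5: one indicator column per group

B <- c(2, 0.5)                     # fixed effects (assumed values)
u <- rnorm(n_groups, sd = 1)       # random intercepts, unobserved in practice
e <- rnorm(nrow(X), sd = 0.5)      # residual error

Y <- drop(X %*% B + Z %*% u + e)   # Y = XB + Zu + e
```

Each row of Z has a single 1 marking the group that observation belongs to, so Zu simply adds the group's random intercept to each observation.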

Now, u is completely unobserved (and B is unknown). If u were observed,
the problem of solving for B would be easy. That is, if we had the
complete data, the maximization problem would be simple. But this is a
missing data problem, and so some process is necessary to help us along.
That is what EM does. It can be used to fill in the missing data in the
vector u (via conditional expectations) to form a complete data problem
and then perform the maximization w.r.t. B and the variance components.

So yes, EM is a useful tool for missing data problems. EM, I think, is
easily programmable in R. But because EM is a general algorithm, I think
the best path for you is to go to the Dempster et al. (1977) paper to
understand how it works and how it can be applied to missing data
problems. Then you need to consider how it will work with your specific
problem, work out the conditional expectations, and then maximize (which
is often the easiest part).
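As a rough sketch of what that looks like for the model Y = XB + Zu + e, the function below runs EM under the assumption that u ~ N(0, s2u * I) and e ~ N(0, s2e * I). The function name em_lmm, the starting values, the fixed iteration count, and the simulated data are all my own choices for illustration; it gives ML (not REML) estimates and has no convergence check.

```r
em_lmm <- function(Y, X, Z, n_iter = 200) {
  n <- length(Y)
  q <- ncol(Z)
  ZtZ <- crossprod(Z)
  B <- solve(crossprod(X), crossprod(X, Y))   # start from OLS, ignoring u
  s2u <- 1                                    # arbitrary starting variances
  s2e <- 1
  for (it in seq_len(n_iter)) {
    # E-step: conditional mean m and covariance V of u given Y
    V <- solve(ZtZ / s2e + diag(q) / s2u)
    m <- V %*% crossprod(Z, Y - X %*% B) / s2e
    # M-step: maximize the expected complete-data log-likelihood
    B <- solve(crossprod(X), crossprod(X, Y - Z %*% m))
    r <- Y - X %*% B - Z %*% m
    s2u <- (sum(m^2) + sum(diag(V))) / q      # E[u'u | Y] / q
    s2e <- (sum(r^2) + sum(ZtZ * V)) / n      # sum(ZtZ * V) = tr(Z V Z')
  }
  list(B = drop(B), u = drop(m), s2u = s2u, s2e = s2e)
}

# Simulated random-intercept data (all values below are assumptions)
set.seed(42)
group <- factor(rep(1:30, each = 10))
x <- rnorm(300)
X <- model.matrix(~ x)
Z <- model.matrix(~ 0 + group)
Y <- drop(X %*% c(2, 0.5) + Z %*% rnorm(30, sd = 1) + rnorm(300, sd = 0.5))

fit <- em_lmm(Y, X, Z)
fit$B     # estimates of the fixed effects
fit$s2u   # estimated random-intercept variance
fit$s2e   # estimated residual variance
```

Note how the M-step is just least squares on Y - Zm plus two averages, which is why the maximization is usually the easy part once the conditional expectations are worked out. The estimates can be checked against an ML fit from an established mixed-model package.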

HTH,
Harold
