Best way to handle missing data?

Thank you for this clarification.  I can see from studying the article
linked below more closely that it confirms what you have said.
http://www.statisticalhorizons.com/wp-content/uploads/MissingDataByML.pdf

The distinction seems to be between missing data in the dependent variable
(which SAS PROC MIXED handles automatically) and missing data in a
predictor variable (which would require switching to a structural equation
modeling program, such as SAS PROC CALIS, to handle it automatically using
full information maximum likelihood, FIML).  Here is a quote from the
conclusion of the article that explains this:

"When estimating mixed models for repeated measurements, PROC MIXED and
PROC GLIMMIX automatically handle missing data by maximum likelihood, as
long as there are no missing data on predictor variables. When data are
missing on both predictor and dependent variables, PROC CALIS can do
maximum likelihood for a large class of linear models..."
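
To make the distinction in the quote concrete, here is a minimal Python
sketch of the row bookkeeping only. It is not how PROC MIXED or any R
package is implemented; the data values, the column names (id, time, x, y)
and the helper usable_rows are all hypothetical. It just shows that a row
with a missing outcome and a row with a missing predictor both drop out of
the fit. Dropping the first is what likelihood-based estimation does
anyway, since the person's other rows still contribute. Dropping the second
is plain deletion, which is why a missing predictor needs FIML or
imputation.

```python
def usable_rows(rows, outcome, predictors):
    """Return the long-format rows a row-wise mixed-model fit could use.

    A row is kept only if the outcome and every predictor are observed.
    (Hypothetical helper, for illustration only.)
    """
    return [
        r for r in rows
        if r[outcome] is not None
        and all(r[p] is not None for p in predictors)
    ]


# Hypothetical long-format data: one row per person per occasion.
rows = [
    {"id": 1, "time": 0, "x": 1.2, "y": 3.1},
    {"id": 1, "time": 1, "x": 1.5, "y": None},   # missing outcome
    {"id": 2, "time": 0, "x": None, "y": 2.7},   # missing predictor
    {"id": 2, "time": 1, "x": 0.9, "y": 2.9},
]

kept = usable_rows(rows, outcome="y", predictors=["time", "x"])
kept_ids = sorted({r["id"] for r in kept})
print(len(kept), "rows kept; people still in the analysis:", kept_ids)
```

Both people stay in the analysis here because each has at least one
complete row; only the incomplete occasions are lost.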

This sounds approximately equivalent to the functionality available in R.

I don't think the model I am working on is a good candidate for structural
equation modeling because the data set is very unbalanced (i.e., there are
very different numbers of observations for different people, taken at
different times), the main relationship of interest involves a time-varying
predictor, and one of the variables with missing data is not continuous (it
is a binary categorical variable).  So, I will stick with the multiple
imputation approach for handling the missing data.
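
For reference, the pooling step of multiple imputation (Rubin's rules) can
be sketched in a few lines of Python. This is a minimal sketch rather than
the analysis used for this model: the function name pool_rubin and the
numbers in the usage example are hypothetical, and it assumes the same
model has already been fitted to each of the m imputed data sets, giving
one coefficient estimate and one squared standard error per data set.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed data sets with Rubin's rules.

    estimates: the coefficient estimate from each imputed data set
    variances: the squared standard error from each imputed data set
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need matching lists from at least 2 imputations")
    q_bar = mean(estimates)          # pooled point estimate
    within = mean(variances)         # average within-imputation variance
    between = variance(estimates)    # between-imputation variance (m - 1)
    total = within + (1 + 1 / m) * between
    return q_bar, total ** 0.5


# Hypothetical estimates and squared standard errors from m = 3 imputations.
est, se = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(est, se)
```

The pooled standard error is larger than any single-imputation standard
error whenever the estimates differ across imputations, which is how the
uncertainty due to the missing data enters the final result.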

Bonnie


On Fri, Feb 27, 2015 at 4:22 PM, Viechtbauer Wolfgang (STAT) <
wolfgang.viechtbauer at maastrichtuniversity.nl> wrote: