Anova II table, df, drop1 and very complex regression models!
True, we don't know how much missing data there is in this particular case, and of course analyzing multiple imputed datasets will take more computational time than analyzing a single dataset (either complete or incomplete).

My point was only that using listwise deletion to reduce an incomplete dataset to a complete one by discarding observations with missing data on the variables involved in the analyses could be changing the nature of the population to which one can legitimately generalize the results. If there is a trivial amount of missing data, that may not matter much because it won't change the answer much (negligible bias). But with larger amounts of missing data that difference may well be very important because now you're likely working from a more biased sample (or we can describe it as a sample that represents a different, more narrowly defined population).

Drawing conclusions about the original target population from a biased sample is poor statistical and scientific practice. The number of people who regularly do so despite the fact that solutions to such a problem exist concerns me. I just wanted the original poster to consider the issues involved and assess the amount of missing data before blindly using listwise deletion.

Steven J. Pierce, Ph.D.
Associate Director
Center for Statistical Training & Consulting (CSTAT)
Michigan State University

-----Original Message-----
From: David Duffy [mailto:David.Duffy at qimrberghofer.edu.au]
Sent: Wednesday, August 31, 2016 11:45 PM
To: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] Anova II table, df, drop1 and very complex regression models!
Steven J. Pierce [pierces1 at msu.edu] wrote:
I'd probably use Mplus (www.statmodel.com) for that, [...]
[snip]
Fortunately, the lavaan package in R replicates some of what Mplus can do.
Every couple of years, I mention OpenMx on this list:

http://openmx.psyc.virginia.edu/
https://cran.r-project.org/web/packages/OpenMx/index.html

I don't know exactly how much of Mplus's functionality it provides, but suspect it would be close to 100%.

As to the OP's question, we don't know how much data is actually missing, or what pattern that takes. We also don't know why some of the fitted models differ by 0 d.f. If we did impute the data 5 times, wouldn't that give us an 80 h run-time?

Cheers, David.
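
As a rough illustration of the listwise-deletion concern raised in this thread, here is a minimal simulation sketch in Python (standard library only). Everything in it is an assumption made up for illustration: the variables x and y, the sample size, and the missingness probabilities have nothing to do with the OP's data. It simulates an outcome y whose chance of being missing depends on an observed covariate x, then compares the mean of y in the full sample with the mean in the complete cases that listwise deletion would retain.

```python
import random
import statistics


def simulate(n=20000, seed=1):
    """Compare the full-sample mean of y with the complete-case mean.

    Hypothetical setup: y = x + noise, so the true mean of y is 0.
    y is more likely to be missing when x is positive, i.e. the
    missingness depends on an observed covariate, not on chance alone.
    """
    rng = random.Random(seed)
    full_y = []
    complete_case_y = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = x + rng.gauss(0.0, 1.0)
        full_y.append(y)
        # Assumed missingness mechanism (illustrative values only).
        p_missing = 0.8 if x > 0 else 0.1
        if rng.random() >= p_missing:
            # This row survives listwise deletion.
            complete_case_y.append(y)
    frac_missing = 1.0 - len(complete_case_y) / n
    return statistics.mean(full_y), statistics.mean(complete_case_y), frac_missing


full_mean, cc_mean, frac_missing = simulate()
print(f"fraction of rows dropped: {frac_missing:.2f}")
print(f"mean of y, full sample:   {full_mean:.3f}")
print(f"mean of y, complete cases: {cc_mean:.3f}")
```

Under these assumed settings the complete-case mean comes out clearly below the full-sample mean, because the retained rows over-represent cases with low x. With a much smaller p_missing the two means would be close, which is the "trivial amount of missing data" situation described above.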