Comparison of lme4, geepack for binary correlated variables
On: Chen et al Genet Epidemiol 2011, 35:650-7.

The latest (dead tree) issue of Genetic Epidemiology has a paper using simulated and real data to compare methods for testing association between a measured genotype (fixed effect) and a dichotomous outcome in pedigrees, where there is residual correlation between observations. They use:

a) an "ordinary" gaussian linear mixed model treating the trait as 0-1 (in lmekin)
b) the binomial-gaussian GLMM using glmer (0.999375-32)
c) GEE in geepack.

Simulated data were produced under a threshold model and, AFAICT [I don't think the paper is well written], a Wald test was used to assess the fixed effect for all three.

You can read the abstract, at least, online: they prefer GEE. Their GLMM test Type I error tends to drift up a little as the trait prevalence increases. They also experienced problems with the GLMM when carrying out small-sample simulations. They did encounter numerical problems with GEE when the trait prevalence was low, but for this situation they preferred the gaussian LMM, as they found this to have OK Type I error rates, and better power than the GLMM (though twice as slow ;)).

The main weakness, of course, is that they did not report LRT results, although they do mention Hauck-Donner effects as a possible cause of their problems. Another possible cause is the generating model, which is convenient but different from the logistic-gaussian. And in my experience, fitting the LMM to binary variables does usually give correct Type I error rates, but overestimates the evidence for association when a true effect is present.

Cheers, David Duffy.
| David Duffy (MBBS PhD)                                         ,-_|\
| email: davidD at qimr.edu.au  ph: INT+61+7+3362-0217 fax: -0101  /     *
| Epidemiology Unit, Queensland Institute of Medical Research   \_,-._/
| 300 Herston Rd, Brisbane, Queensland 4029, Australia  GPG 4D0B994A v