sanity-checking plans for glmer
Mitchell Maltenfort wrote:
Having briefly fallen for the notion that the negative.binomial family in MASS could be used in glmer, I want to use these lists for a sanity check on my final (?) plans. I want to use glmer for logistic regression and for poisson regression on a data set of 10,000 items. There will be two crossed random effects. For the logistic regression, I want odds ratios with confidence intervals.For the poisson regression, I'd want multiplying factors, again with confidence intervals. Either way, my co-authors will want p-values. I'd also like to examine whether the two random effects have comparable variation, and whether any of them are provide odds ratios or multipliers that are are statistically distinct from 1.0 . Can I use the s.e's and p-values (from z-statistics) provided by glmer? Or should I use the pvals.fnc function from the language R package? Are there any holes in my plans I'm not seeing?
[cross-posting to multiple R lists is discouraged.] How difficult this is, practically, depends on how many random-effects levels you have (in the random effect with the minimum number of random-effect levels). If this number is large (>50? Angrist and Pischke give "42" as the rule of thumb in their Douglas-Adams-themed econometrics book ...), then you can get away with Wald-Z or likelihood ratio inferences without worrying about adjusting for finite-sample effects (i.e., use anova() or the s.e.'s and p-values provided by glmer). For the random effects, use anova() [or possibly the RLRsim package, although I don't know if it will work in this case]. Confidence intervals on the random effects are harder; see the "standard errors of variance estimates" section in <http://glmm.wikidot.com/faq>. (It says there that you can use lme4a, the bleeding-edge R-forge version of lme4, for profiles, but as of a couple of days ago profiles weren't available for GLMMs ...) pvals.fnc from the languageR function hasn't worked for a while, as far as I know, because it uses the mcmcsamp() function which has been disabled for some months (due to difficulty getting the sampling algorithm to mix reliably). MCMCglmm would be a reasonable alternative -- I don't know how slow it would be with 10,000 items, I have used it successfully for a problem with 1600, so it should definitely work given that you're willing to be patient. good luck, Ben Bolker