Extracting p-values for main effects of a binomial GLMM
On 15-03-05 03:08 PM, John Maindonald wrote:
An issue that has not been raised so far is whether the binomial or other GLM-type variance accounts for all relevant observation-level sources of variation. This is pertinent for all generalised linear mixed models. In my experience, over-dispersed binomial or Poisson responses are much more common (certainly in areas such as ecology) than strictly binomial or Poisson ones. The effect on the variance can be huge, multiplying it by a factor of 3 or 4 or more; more generally, the multiplier may not be a constant factor. The issue is most serious for comparisons where the relevant variances are the observation-level variances. With glmer(), one way to handle this is to fit an observation-level random effect; the multiplier is then not a constant factor. glmer() did at one time allow a constant multiplier. I think it unfortunate that this was removed, as it restricted the options available.
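To make the observation-level trick concrete, here is a minimal sketch with glmer() (the data frame d and the columns succ, fail, trt, and site are hypothetical, not from the original thread):

```r
library(lme4)

## One factor level per row of the data:
d$obs <- factor(seq_len(nrow(d)))

## The (1 | obs) term is an observation-level random effect that
## absorbs extra-binomial variation; the implied variance inflation
## is then not a constant multiplier.
m_olre <- glmer(cbind(succ, fail) ~ trt + (1 | site) + (1 | obs),
                family = binomial, data = d)
```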
It was removed because we (Doug Bates in particular) decided that we really didn't understand at a theoretical level what the "quasi-" models were doing, and that they sometimes seemed to produce unreasonable outputs. That doesn't mean they *can't* be understood at a theoretical level -- in principle, they should act similarly to other conditional response distributions with free scale parameters, which we do allow (e.g. Gamma) -- but this is still on our very long list of things to try to figure out.
Using negative binomial errors is in principle another possibility, but such models can be difficult to get to converge. If the model is wrong, it may well not converge, which is an obstacle to getting to the point where one has something half-sensible that does converge; this is the case for glm.nb() as well as for glmer.nb().
For what it's worth glmmADMB (which operates in a general optimization framework, not in a GLM-like IRLS mode) allows both NB2 (negative binomial parameterized in the traditional V=mu*(1+mu/k) way) and NB1 (negative binomial parameterized so that V=phi*mu, as in the quasi-Poisson case)
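In glmmADMB the two parameterizations are selected via the family argument; a brief sketch (the response y, predictor trt, grouping factor site, and data frame d are hypothetical):

```r
library(glmmADMB)

## NB2: V = mu * (1 + mu/k), the traditional negative binomial
m_nb2 <- glmmadmb(y ~ trt + (1 | site), family = "nbinom",  data = d)

## NB1: V = phi * mu, linear in the mean as in quasi-Poisson
m_nb1 <- glmmadmb(y ~ trt + (1 | site), family = "nbinom1", data = d)
```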
With predicted probabilities that are close to 0 or 1, or Poisson means that are close to 0, there is also the Hauck-Donner effect to worry about: the standard errors of Wald statistics become nonsense, and the z-statistics become *smaller* as the distance between the means being compared increases.
Well, in this case, just use profile confidence intervals or LRTs. At least it's usually pretty obvious when this is happening.
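For a fitted glmer() model, both alternatives are short; a sketch (the model m1 and the term trt are hypothetical):

```r
## Profile confidence intervals avoid Wald standard errors entirely:
confint(m1, method = "profile")

## Likelihood-ratio test of a single term by explicit model comparison:
m0 <- update(m1, . ~ . - trt)
anova(m0, m1)   # chi-square LRT for 'trt'
```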
I guess the message is that this is treacherous territory, and it really is necessary to know what one is doing!
Yes; we have all of the complexities of linear models, GLMs (which seem to be the issues you are pointing out here), and mixed models on top of that ...
I'd like to echo Jonathon Baron's comments on the "main effects in the presence of interactions" issue.

John Maindonald
email: john.maindonald at anu.edu.au
Wellington, NZ

On 6/03/2015, at 0:25, Henrik Singmann <henrik.singmann at psychologie.uni-freiburg.de> wrote:
Hi,

As others have pointed out, obtaining p-values for effects is not trivial and there are pitfalls. Nevertheless, I think the easiest way to obtain what you want is to use mixed(..., method = "LRT") from package afex (full disclosure: I am the main author of said package and function).

mixed() takes care of some issues, such as using appropriate coding and how to test main effects in the presence of interactions, in not uncontroversial but sensible ways (e.g., it handles the latter by simply removing the main effect in a brute-force manner from the model matrix and comparing the full model with the so-reduced model). The way things are handled in mixed() might not be correct in all cases, but it may provide exactly the combination of ease of use and sensible defaults that is helpful in your case. In other words, it tries to adopt an approach often taken by commercial statistics packages. You simply call mixed() instead of glmer(); it fits and refits various versions of the model to provide likelihood-ratio tests for all effects, as you want them.

Note, however, that mixed() cannot deal with a cbind() expression as the dependent variable. What you need to do instead is compute the proportion of "successes" and pass the number of observations as weights. Assuming your data reside in a data.frame df, this would result in something along the following lines:

df$prop_weights <- df$Offspring + df$Eggs_Offspring
df$proportion_hatched <- df$Offspring / df$prop_weights
m1 <- mixed(proportion_hatched ~ Diet * Infection_status * Day +
              (1|SubjectID) + (1|Day),
            data = df, family = binomial,
            weights = df$prop_weights, method = "LRT")

However, as Ken Beath has said, the main effects may not be sensible given the interactions. To assess the degree of this threat for your data (which very much depends on the type and size of the interactions), I suggest following the excellent advice of fortune(193) and "take the unusual step of plotting the data."
This can be done nicely with, e.g., the effects package, even for complicated interactions and binomial mixed models.

Hope this helps,
Henrik

On 04.03.2015 at 21:11, Megan Kutzer wrote:
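A sketch of such a plot with the effects package, assuming m1 is the underlying glmer() fit (the object name is hypothetical):

```r
library(effects)

## Computes and plots fitted values for all high-order terms,
## including the three-way interaction:
plot(allEffects(m1))
```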
Hi,

I'm fairly new to mixed models and have done a lot of reading without much success. Unfortunately, there is no one at my institution who is really familiar with them, so I thought I would try this list. I'm running a binomial generalized linear mixed effects model and I need p-values for the main effects. I know this isn't entirely correct with this type of model, but my supervisor wants the p-values!

The model is:

glmer(Proportion_hatched ~ Diet * Infection_status * Day + (1|SubjectID) + (1|Day), family = binomial)

where:
- Proportion_hatched = cbind(Offspring, Eggs - Offspring)
- Diet is a factor with 2 levels
- Infection_status is a factor with 4 levels
- Day is a factor with 3 levels

Using SubjectID and Day as random effects is supposed to control for pseudoreplication in the model, although I am not entirely sure that this is specified in the correct way. I wanted to include experimental replicate here too, but the model failed to converge.

My question is: is there a way to get p-values for the main fixed effects of Diet, Infection_status and Day? If you need more specific model information or the model output, I would be happy to provide it.

Thanks,
Megan
--
Dr. Henrik Singmann
Universität Zürich, Switzerland
http://singmann.org
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models