Comparing results from glmer and glht
Thierry,
Thank you for the suggestion. I think I understand, though creating
user defined contrasts will be difficult.
I create a combined variable:
wshrip <-c bind(wsh, rip)
Resulting in a new model
modelEPT<-glmer(EPT ~ wshrip + (1|stream) + (1|stream:rip) + (1|obs),
data=ept, family=binomial(link="logit"))
The user defined contrasts I tried previously were:
wsh <- rbind("C vs. D" = c(1,0,0,0,0,0,0,0),
"C vs. F" = c(0,1,0,0,0,0,0,0),
"C vs. G" = c(0,0,1,0,0,0,0,0),
"D vs. F" = c(-1,1,0,0,0,0,0,0),
"D vs. G" = c(-1,0,1,0,0,0,0,1),
"F vs. G" = c(0,-1,1,0,0,0,0,0))
This did not give me reasonable results, prompting me to use built-in
contrasts:
summary(glht(modelEPT, linfct=mcp(wsh="Tukey")))
For riparian contrasts I used:
rip <- rbind("C: Forested vs. Non-Forested" = c(0,0,0,0,1,0,0,0),
"D: Forested vs. Non-Forested" = c(0,0,0,0,1,1,0,0),
"F: Forested vs. Non-Forested" = c(0,0,0,0,1,0,1,0),
"G: Forested vs. Non-Forested" = c(0,0,0,0,1,0,0,1))
I dont understand how these contrasts are defined, but based them on
contrasts on an almost identical design found here:
http://thebiobucket.blogspot.com/2011/06/glmm-with-custom-multiple-comparisons.html#more
Could you please explain how to use user-defined for the model with a
wshrip combined variable? I cant find a clear example of how to do
this. The parameter length is different from the original model now
that there is a combined variable, correct?
Thank you very much, I'd be lost in the dark without this mailing list.
Colin
On Mon, Jan 16, 2012 at 12:58 AM, ONKELINX, Thierry
<Thierry.ONKELINX at inbo.be> wrote:
Dear Colin, I don't think your Tukey tests are very informative. Because you are testing main effects how have an interaction with another variable. So you are not testing the overall effect of a variable but the effect when all other variable are 0 (continuous) or at the reference level (factor). A better way of doing this could bet o create a new variable which is the interaction between wsh and rip and use that in your model instead of wsh * rip. Then you can use glht() with user defined contrasts so that the contrasts test what you want to test. Best regards, Thierry ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance Kliniekstraat 25 1070 Anderlecht Belgium + 32 2 525 02 51 + 32 54 43 61 85 Thierry.Onkelinx at inbo.be www.inbo.be To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher The plural of anecdote is not data. ~ Roger Brinner The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey -----Oorspronkelijk bericht----- Van: r-sig-mixed-models-bounces at r-project.org [mailto:r-sig-mixed-models-bounces at r-project.org] Namens Colin Wahl Verzonden: zondag 15 januari 2012 1:23 Aan: r-sig-mixed-models at r-project.org Onderwerp: [R-sig-ME] Comparing results from glmer and glht I will try to make this concise. Background: I am testing the effects of land use and forested riparian buffers on stream invertebrates and in-stream variables. There are 4 watershed types (defined by 4 types of land use) and two riparian types (forested and non). Percent EPT (relative abundance) was my main response variable. I also measured a variety of in-stream variables like temperature, nutrients, and toxicity. There are 72 observations for invertebrates, and 24 for in-stream variables. I am curious of how acceptable p values are from pairwise comparisons using glht() from the multcomp package I used glmer with a binomial error structure and an observation-level random effect (to account for overdispersion), to model invertebrates: modelEPT<-glmer(EPT ~ wsh*rip + (1|stream) + (1|stream:rip) + (1|obs), data=ept, family=binomial(link="logit")) ? AIC ? BIC logLik deviance ?284.4 309.5 -131.2 ? ?262.4 Random effects: ?Groups ? ? Name ? ? ? ?Variance Std.Dev. ?obs ? ? ? ?(Intercept) 0.30186 ?0.54942 ?stream:rip (Intercept) 0.40229 ?0.63427 ?stream ? ? (Intercept) 0.12788 ?0.35760 Number of obs: 72, groups: obs, 72; stream:rip, 24; stream, 12 Fixed effects: ? ? ? ? ? ? ? ? Estimate Std. Error z value Pr(>|z|) (Intercept) ? -4.2906 ? ? 0.4935 ? -8.694 ?< 2e-16 *** wshd ? ? ? ? ? -2.0557 ? ? 0.7601 ?-2.705 0.00684 ** wshf ? ? ? ? ? ?3.3575 ? ? 0.6339 ? 5.297 ?1.18e-07 *** wshg ? ? ? ? ? 3.3923 ? ? 0.7486 ? ?4.531 ?5.86e-06 *** ripN ? ? ? ? ? ? 0.1425 ? ? 0.6323 ? 0.225 ?0.82165 wshd:ripN ? ? 0.3708 ? ? 0.9682 ? 0.383 ?0.70170 wshf:ripN ? ?-0.8665 ? ? 0.8087 ? -1.071 ?0.28400 wshg:ripN ? ?-3.1530 ? ? 0.9601 ?-3.284 ?0.00102 ** --- Correlation of Fixed Effects: ? ? ? ? ? ? ? ? (Intr) ?wshd ? wshf ? wshg ? ripN ? wshd:N wshf:N wshd ? ? ? ?-0.649 wshf ? ? ? ?-0.779 ?0.505 wshg ? ? ? ?-0.659 ?0.428 ?0.513 ripN ? ? ? ? -0.644 ?0.418 ?0.501 ?0.424 wshd:ripN ?0.421 -0.672 -0.327 -0.277 -0.653 wshf:ripN ?0.503 -0.327 -0.638 -0.332 -0.782 ?0.511 wshg:ripN ?0.424 -0.275 -0.330 -0.632 -0.659 ?0.430 ?0.515 I then used this model to do Tukey's HSD contrasts between watershed types: summary(glht(modelEPT, linfct=mcp(wsh="Tukey"))) Linear Hypotheses: ? ? ? ? ? ? ? ?Estimate Std. Error z value Pr(>|z|) d - c == 0 -2.05573 ? ?0.76010 ?-2.705 ? 0.0341 * f - c == 0 ?3.35753 ? ?0.63386 ? 5.297 ? <0.001 *** g - c == 0 ?3.39231 ? ?0.74862 ? 4.531 ? <0.001 *** f - d == 0 ?5.41326 ? ?0.70176 ? 7.714 ? <0.001 *** g - d == 0 ?5.44804 ? ?0.80692 ? 6.752 ? <0.001 *** g - f == 0 ?0.03479 ? ?0.68931 ? 0.050 ? 1.0000 and riparian types: ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?Estimate Std. Error z value Pr(>|z|) C: Forested vs. Non-Forested == 0 ? ? ? ? 0.1425 ? ? 0.6323 ? 0.225 ?0.99999 D: Forested vs. Non-Forested == 0 ? ? ? ? 0.5134 ? ? 0.7332 ? 0.700 ?0.98659 F: Forested vs. Non-Forested == 0 ? ? ? ?-0.7239 ? ? 0.5042 ?-1.436 ?0.69625 G: Forested vs. Non-Forested == 0 ? ? ? ?-3.0105 ? ? 0.7225 ?-4.167 ?< 0.001 *** Are these p values accurate? Or is that a personal judgement I have to make based on the clarity of the patterns they reflect? I've shown these results in my figures and explained them in my results. I've basically explained that though these p values reasonably reflect patterns in my data, effects sizes, and variances, that they are inexact and potentially anti-conservative due to the issues with degrees of freedom in mixed models.
From what I understand from my research in the last year is that
Douglas Bates and others advocate something of a paradigm shift away from the petagogically reinforced reliance on cryptic p values toward more in depth discussions of effects sizes and variances. The use of MCMC sampling and HPD intervals are suggested, but these are not available for generalized models. I am interested in publishing these results as an ecologist, not a statistician (pardon the somewhat artificial distinction), and, I am very interested in what kind of a discussion the statisticians and ecologists of the r-sig-mixed-models mailing list would like to see as potential reviewers. Thank you, Colin Wahl M.S. candidate, Dept. of Biology Western Washington University Bellingham, WA
_______________________________________________ R-sig-mixed-models at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models