Haldre Rogers <haldre1 at ...> writes:
Hello-
Just a quick follow-up so that the list isn't *completely* silent on the topic. I think the main reasons you're not getting any answers are that (1) your question is complex (speaking just for myself, I look at it and think "oh, that's kind of hairy -- maybe I'll have a chance to look at that tomorrow ...") and (2) I'm pretty sure that your questions are more general linear-model-parameterization questions than specifically (G)LMM questions, so people may not be jumping in on the list.
I conducted a field experiment where I added seedlings near and far from conspecific trees at multiple plots at multiple sites on multiple islands and recorded survival, among other things. Here, I'm trying to determine the overall effect of distance (near vs far) on the proportion of seeds that survived in each plot (site is a random effect). I'd also like to determine whether the effect of distance varies by canopy openness. Finally, I would like to test whether the effect of distance differs between islands. I'm confused about how to set up the right contrasts given the interactions, and how to interpret the output. Treatment contrasts use a reference level, which is fine for distance where 'near' is a meaningful reference level, but for 'island', the reference level is not meaningful (i.e. it doesn't make sense to compare each island to one arbitrarily chosen island). A link to my data, and output from an example are below. This is for one species, but I have data for six species total, and ideally, I'd like to compare these effects between species (see question 2 below). I have three specific questions at the bottom.
The quick answer is that you can use sum-to-zero contrasts (contr.sum) for island, if you want, which will make the main effect of distance be the (unweighted) average effect across islands.
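A minimal sketch of what that means, using simulated data and a plain lm() rather than the actual glmer fit (the data frame `d`, the response `y`, and the effect sizes are made up for illustration). With treatment contrasts the `distfar` coefficient is the far-minus-near difference on the reference island only; with contr.sum on island it becomes the unweighted average of the per-island differences:

```r
# Toy data (hypothetical, not the poster's): 4 islands x 2 distances, 5 plots per cell
set.seed(1)
d <- expand.grid(island = c("A", "B", "C", "D"),
                 dist   = c("near", "far"),
                 rep    = 1:5,
                 stringsAsFactors = FALSE)
d$island <- factor(d$island)
d$dist   <- factor(d$dist, levels = c("near", "far"))
# Distance effect grows from island A to island D (an arbitrary choice)
d$y <- rnorm(nrow(d), mean = as.integer(d$island) * (d$dist == "far"))

# Default treatment contrasts for both factors
fit_trt <- lm(y ~ island * dist, data = d)

# Sum-to-zero contrasts for island only; dist keeps 'near' as reference
d_sum <- d
contrasts(d_sum$island) <- contr.sum(4)
fit_sum <- lm(y ~ island * dist, data = d_sum)

# Per-island far-minus-near differences from the raw cell means
cell <- tapply(d$y, list(d$island, d$dist), mean)
per_island_effect <- cell[, "far"] - cell[, "near"]

coef(fit_trt)["distfar"]   # equals the island A difference only
coef(fit_sum)["distfar"]   # equals the unweighted average over islands
mean(per_island_effect)
```

The same `contrasts(...) <- contr.sum(...)` assignment can be made on the island factor before fitting a glmer model; there the averaging is on the link (logit) scale.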
Here's a link to the data: https://www.dropbox.com/s/fregbu154zw7whz/aglaifen.csv
> aglaiafen<-read.csv("aglaifen.csv")
> aglaiafen$dist<-factor(aglaiafen$dist, levels=c("near", "far"))
>
> #response is cbind(number of seedlings that survived, number of seedlings that died), family = binomial
> #Three predictors: island, distance, and canopy openness
> #island has 4 levels (A, B, C, D). Island A is the reference level by default, but is not really meaningful as a reference level.
> #dist has 2 levels, near and far, with near set as the reference level.
> #centavgopen is centered at the mean canopy openness.
> #site is a random effect (3-5 sites per island, 4 islands). There are usually 4 near plots and 4 far plots per site.
>
> #Here are three approaches to analysis that give qualitatively similar answers, but I'm not sure which (if any) is the best approach.
>
Sorry, I don't have time to dig through all of this ...
*Questions:*
> # 1) Is one of these methods (or some other method) best for testing the effect of distance on survival relative to canopy openness and island? Can I conclude that, for this species, there is not a distance effect for island A, B or D, there isn't an interaction between distance and openness, but that there was lower seedling survival in far plots on island C? Is that lower relative only to near plots on island C? Why don't the coefficients for the interactions change when contrasts are changed?
> # 2) I have similar data for five other species, and I'd like to compare the magnitude of the distance effect between species. Would adding species to the model (by adding species*island*distance and species*openness*distance) be advisable, or is it better to just analyze each species separately to avoid the challenges of interpreting three-way interactions, especially in glmer models, where options for hypothesis testing are more limited than for lm or glm fits? Again, I don't really want to know the effect of one species relative to a reference species, but simply to compare the magnitude of the distance effect between species. How do I set up my model to do that? I can share the full dataset offline if that would help.
Well, the three-way interaction *is* the magnitude of the difference among species. You could use drop1() to test the overall effect of the interaction (i.e., the overall magnitude of among-species differences in the island*distance and openness*distance effects).
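A rough sketch of the drop1() idea, using simulated data and a plain binomial glm as a stand-in (the data frame `d`, the three made-up species, and the constant survival probability are all assumptions for illustration). lme4 provides a drop1() method for glmer fits, so the call should have the same form on the real model:

```r
# Toy data (hypothetical): 3 species x 4 islands x 2 distances, 4 plots per cell
set.seed(42)
d <- expand.grid(species = c("sp1", "sp2", "sp3"),
                 island  = c("A", "B", "C", "D"),
                 dist    = c("near", "far"),
                 plot    = 1:4,
                 stringsAsFactors = TRUE)
d$dist <- factor(d$dist, levels = c("near", "far"))
d$n    <- 20                                        # seedlings planted per plot
d$surv <- rbinom(nrow(d), size = d$n, prob = 0.5)   # no true effects here

fit <- glm(cbind(surv, n - surv) ~ species * island * dist,
           family = binomial, data = d)

# drop1() respects marginality: only the three-way term is eligible to drop,
# so this is a single likelihood ratio test of the whole interaction
lrt <- drop1(fit, test = "Chisq")
lrt
```

The single row for `species:island:dist` tests all of the interaction coefficients at once, so it does not depend on which species or island is the reference level.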
> # 3) Is the use of confint(model, method="Wald") acceptable for testing this? I tried using bootMer a few different ways (e.g. bootMer(survm1, nsim=1000), using a couple different functions), but all bootstrap runs fail each time I've tried. Likelihood ratio tests can tell me whether a factor should be included in a model, but not differences between levels of factors (e.g. islands or species). In addition, results from LRTs are similar to those obtained using confint. Setting up the glht (multcomp package) for this, given all of the interactions and the continuous openness variable, seems a bit overwhelming.
If confint() results are similar to *summary()* results then confint(model,method="Wald") is OK ... results from LRTs and those using confint(model,method="profile") (i.e. the default) *should* be identical ...
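For reference, a small sketch of what a Wald interval is, again with simulated data and a plain glm standing in for the glmer fit (the data frame `d` and the variable `open` are made up). The interval is just estimate +/- 1.96 standard errors, built from the same numbers that summary() prints:

```r
# Toy binomial data (hypothetical; a plain glm stands in for the glmer fit)
set.seed(7)
d <- data.frame(dist = factor(rep(c("near", "far"), each = 20),
                              levels = c("near", "far")),
                open = rnorm(40),   # stand-in for centered canopy openness
                n    = 20)
d$surv <- rbinom(nrow(d), size = d$n, prob = 0.5)

fit <- glm(cbind(surv, n - surv) ~ dist * open, family = binomial, data = d)

# Wald interval by hand from the summary() table
est <- coef(summary(fit))[, "Estimate"]
se  <- coef(summary(fit))[, "Std. Error"]
wald_by_hand <- cbind(est - qnorm(0.975) * se, est + qnorm(0.975) * se)

# confint.default() is the Wald interval for glm fits
wald_ci <- confint.default(fit)
wald_ci
```

For a glmer fit the analogous call is the confint(model, method="Wald") from the question; profile intervals take longer to compute but do not rely on the quadratic approximation to the log-likelihood.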