*From:* highstat at highstat.com <highstat at highstat.com>
*Sent:* Tuesday, October 15, 2024 11:56 AM
*To:* r-sig-ecology at r-project.org; acbarton at mun.ca; Philippi, Tom
<Tom_Philippi at nps.gov>
*Cc:* tephilippi <tephilippi at gmail.com>; Bert van der Veen
<bert.v.d.veen at ntnu.no>
*Subject:* RE: [EXTERNAL] Re: [R-sig-eco] Using GLMs or GLMMs for
diversity metrics?
Hello Tom,
The development version of gllvm allows you to include multiple random
effects. And it can also do random slopes. Hence, I think that the
models below can be fitted in gllvm. But you may want to double check
that with the gllvm-folks (they are super helpful).
With a package like gllvm available, one should really stop using the
more classical multivariate methods.
Personally, I do not like to use year (your yearF) as a random
intercept, but that is a different discussion.
Kind regards,
Alain
On 15 Oct 2024 at 20:24 +0200, Philippi, Tom <Tom_Philippi at nps.gov>,
wrote:
One reason Alana might not want to use GLLVM instead is the
repeated sites and repeated years structure.
The standard model for a trend in a single species with multiple
sites revisited multiple times (years) from VanLeeuwen et al. 1996
and Piepho & Ogutu 2002 includes random effect intercepts and
(temporal) slopes among sites, and random effects of temporal
(year to year) fluctuations concordant across all sites. In lme4
formula notation (where yearC is centered continuous year and
yearF is year as a factor):
Y ~ yearC + (1 | yearF) + (1 + yearC | siteID)
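A minimal sketch of fitting this trend model with lme4, assuming simulated data. The data frame dat, the 6 sites by 12 years layout, and every effect size below are invented for illustration and are not from the thread.

```r
# Sketch only: simulated data, illustrative names and values.
library(lme4)

set.seed(1)
dat <- expand.grid(siteID = factor(paste0("S", 1:6)), year = 2010:2021)
dat$yearC <- dat$year - mean(dat$year)   # centered continuous year
dat$yearF <- factor(dat$year)            # year as a factor

# Site-specific intercepts and slopes, plus a year effect shared by all sites
site_int   <- rnorm(6, sd = 1.0)
site_slope <- rnorm(6, sd = 0.1)
year_eff   <- rnorm(12, sd = 0.5)
dat$Y <- 10 + 0.2 * dat$yearC +
  site_int[as.integer(dat$siteID)] +
  site_slope[as.integer(dat$siteID)] * dat$yearC +
  year_eff[as.integer(dat$yearF)] +
  rnorm(nrow(dat), sd = 0.5)

# Fixed trend, random year intercepts, random site intercepts and slopes
fit <- lmer(Y ~ yearC + (1 | yearF) + (1 + yearC | siteID), data = dat)
summary(fit)
```

With so few sites the site-level variance estimates can be poorly determined, so a singular-fit message from lmer would not be surprising here.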
One of those papers notes that what I denote as "yearC" could just
as well be a predictor covariate that varies over years: the
general model still needs to account for the correlated /
concordant across all sites fluctuations in the covariate. In this
case with an interest in warm v cold years, the model would be:
Y ~ temperature + (1 | yearF) + (1 + temperature | siteID)
with temperature a factor of {warm, cold}
The appropriate "test" of warm v cold years is against year to
year fluctuations, lest a wet spring or some other event affecting
all sites be interpreted as 6 independent events across the 6 sites.
Given the binary nature of the temperature predictor and only 6 sites,
the simpler model might need to be fit unless there are a largish
number of years in the study:
Y ~ temperature + (1 | yearF) + (1 | siteID)
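A rough sketch of the simpler model, again on simulated data, which also illustrates why the year random effect matters when temperature is a year-level factor. The names dat, fit_with_year and fit_no_year, and all simulated values, are assumptions made for illustration.

```r
# Sketch only: simulated data, illustrative names and values.
library(lme4)

set.seed(2)
years <- 2010:2021
temp_by_year <- factor(rep(c("cold", "warm"), length.out = length(years)))

dat <- expand.grid(siteID = factor(paste0("S", 1:6)), year = years)
dat$yearF <- factor(dat$year)
dat$temperature <- temp_by_year[match(dat$year, years)]  # constant within a year

year_eff <- rnorm(length(years), sd = 0.5)   # fluctuation shared by all sites
site_eff <- rnorm(6, sd = 1.0)
dat$Y <- 10 + 1 * (dat$temperature == "warm") +
  year_eff[as.integer(dat$yearF)] +
  site_eff[as.integer(dat$siteID)] +
  rnorm(nrow(dat), sd = 0.5)

# Simpler model from the text: random intercepts for year and site
fit_with_year <- lmer(Y ~ temperature + (1 | yearF) + (1 | siteID), data = dat)

# For comparison only: dropping the year effect treats the 6 sites in a
# year as independent evidence about warm v cold
fit_no_year <- lmer(Y ~ temperature + (1 | siteID), data = dat)

se_with    <- coef(summary(fit_with_year))["temperaturewarm", "Std. Error"]
se_without <- coef(summary(fit_no_year))["temperaturewarm", "Std. Error"]
c(with_yearF = se_with, without_yearF = se_without)
```

When year-to-year fluctuations are real, the standard error of the temperature effect would typically be larger in the model that includes (1 | yearF), which is the more honest reflection of how many independent warm and cold years were observed.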
GLLVM is an awesome tool, but it addresses different questions,
and to the best of my knowledge and experimentation with it, it doesn't
accommodate this form of sampling design. I would love to be
corrected by an example of how to specify this random effect
structure driven by the sampling process. And, I would probably
include something from GLLVM as an additional perspective on
patterns in the data.
That Piepho & Ogutu model in glmer works for counts of a species,
and for species richness, via Poisson or negative binomial
families. [brms or glmmTMB may do a better job on the estimation
than glmer.] Diversity indices & evenness are a bit trickier
because they are continuous but their error distributions are
rarely normal and can be constrained or truncated.
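A hedged sketch of the count version for species richness, using a Poisson family in glmer on simulated data. The column name richness and all simulated values are illustrative assumptions. The negative binomial alternative is shown only as a comment because it needs the glmmTMB package to be installed.

```r
# Sketch only: simulated richness counts, illustrative names and values.
library(lme4)

set.seed(3)
dat <- expand.grid(siteID = factor(paste0("S", 1:6)), year = 2010:2021)
dat$yearC <- dat$year - mean(dat$year)
dat$yearF <- factor(dat$year)

# Linear predictor on the log scale with site and year random intercepts
eta <- log(20) + 0.03 * dat$yearC +
  rnorm(6, sd = 0.3)[as.integer(dat$siteID)] +
  rnorm(12, sd = 0.2)[as.integer(dat$yearF)]
dat$richness <- rpois(nrow(dat), lambda = exp(eta))

fit_pois <- glmer(richness ~ yearC + (1 | yearF) + (1 + yearC | siteID),
                  family = poisson, data = dat)

# Rough overdispersion check: a ratio well above 1 suggests trying a
# negative binomial family instead
sum(residuals(fit_pois, type = "pearson")^2) / df.residual(fit_pois)

# Possible negative binomial fit (assumes glmmTMB is installed):
# fit_nb <- glmmTMB::glmmTMB(
#   richness ~ yearC + (1 | yearF) + (1 + yearC | siteID),
#   family = glmmTMB::nbinom2, data = dat)
```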
I'm jumping in here because I was dealing with just this issue
last week for vegetation monitoring at Santa Monica Mountains
National Recreation Area. They have species richness recorded by
segments of a single 1x30m transect at each site, and will be
testing for trends over time in species richness at the spatial
scales of transects, and of 1x5m segments (nested within
transects). Their approach to a Shannon diversity metric from 100
point intercepts at each site is to fit the P&O model with the
additional covariates and normal errors, then use a densityplot of the
residuals to see if they are unimodal and close to normally
distributed. If not, they'll use a different error distribution in
glmmTMB or brms. Note that they also have a beta diversity among
segments within each site, and could in theory treat that the same
way they treat the diversity metric from the point intercepts.
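The residual check described above could look roughly like this. It is a sketch on simulated Shannon values with no additional covariates; the column name shannon and all simulated values are assumptions for illustration, and densityplot here is the lattice function.

```r
# Sketch only: simulated Shannon index values, illustrative names and values.
library(lme4)
library(lattice)

set.seed(4)
dat <- expand.grid(siteID = factor(paste0("S", 1:6)), year = 2010:2021)
dat$yearC <- dat$year - mean(dat$year)
dat$yearF <- factor(dat$year)
dat$shannon <- 2 + 0.01 * dat$yearC +
  rnorm(6, sd = 0.2)[as.integer(dat$siteID)] +
  rnorm(12, sd = 0.1)[as.integer(dat$yearF)] +
  rnorm(nrow(dat), sd = 0.15)

# Normal-error fit of the trend model
fit_norm <- lmer(shannon ~ yearC + (1 | yearF) + (1 + yearC | siteID),
                 data = dat)

# Inspect the residuals: unimodal and roughly symmetric is the hope
res <- resid(fit_norm)
p <- densityplot(~ res, xlab = "Residual")
print(p)

# A normal Q-Q plot is a common companion check
qqnorm(res)
qqline(res)
```

If the residuals look skewed or bounded, that would be the cue to try a different error distribution in glmmTMB or brms, as the text suggests.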
I hope this helps you think about what you are trying to learn
about your data, especially the not always obvious artifacts of
the sampling that should be accounted for in the analyses, lest
your results reflect the sampling design rather than the
ecological responses.
Irwin, B.J., Wagner, T., Bence, J.R., Kepler, M.V., Liu, W. and
Hayes, D.B., 2013. Estimating spatial and temporal components of
variation for fisheries count data using negative binomial mixed
models. Transactions of the American Fisheries Society, 142(1),
pp.171-183.
Piepho, H.P. and Ogutu, J.O., 2002. A simple mixed model for trend
analysis in wildlife populations. Journal of Agricultural,
Biological, and Environmental Statistics, 7, pp.350-360.
VanLeeuwen, D.M., Murray, L.W. and Urquhart, N.S., 1996. A mixed
model with both fixed and random trend components across time.
Journal of Agricultural, Biological, and Environmental Statistics,
pp.435-453.
Tom Philippi
Inventory and Monitoring Program Central Support Office
National Park Service
Tom_Philippi at nps.gov
-----Original Message-----
From: R-sig-ecology <r-sig-ecology-bounces at r-project.org> On
Behalf Of Alain Zuur via R-sig-ecology
Sent: Sunday, October 13, 2024 2:41 AM
To: r-sig-ecology at r-project.org; acbarton at mun.ca
Subject: [EXTERNAL] Re: [R-sig-eco] Using GLMs or GLMMs for
diversity metrics?
Date: Fri, 11 Oct 2024 18:25:38 -0400
From: "Barton, Alana Charlotte" <acbarton at mun.ca>
To: r-sig-ecology at r-project.org
Subject: [R-sig-eco] Using GLMs or GLMMs for diversity metrics?
Hello,
I would appreciate some help in a question regarding
statistical analysis.
I'm looking at species count data where sampling was carried out over
multiple years at repeated sites. So six different sites were sampled
each year, for example. The years were categorized into a temperature
group with two levels: warm or cold. However, I'm only interested in
exploring community differences between temperature groups and across
years. I used the vegan package in R to calculate diversity metrics
(abundance, richness, diversity index, evenness) and want to test
statistically for differences in these metrics between groups and
among years.
I have been using manyglm from the mvabund package with a negative
binomial distribution, but there is the issue that mvabund doesn't fit
non-integer data well, and I'm worried it's incorrectly computing
diversity and evenness stats. Additionally, I'm wondering if the
repeated sites should be added as a fixed effect to mitigate this? Or
is site actually a random effect, so that a mixed model is more
appropriate, using glmmTMB instead in this case? I'm not terribly
familiar with using mixed models in R so any help is appreciated.
Thank you for your help
Hello Alana,
Instead of using a diversity index, why not focus on the original
species using a multivariate GLMM? You can use a generalised
linear latent variable model (GLLVM) for this. That is a more
useful analysis than using 4 different diversity indices
(which, by the way, are all derived from the same data, and that
is a problem in itself).
You can find information on GLLVM here:
https://jenniniku.github.io/gllvm/articles/vignette1.html
Or you can join one of our upcoming online workshops on GLLVM:
https://www.highstat.com/Courses/Flyers/Flyer2024_01_SpatTempGLM.pdf
This workshop is in the EU time zone, but we are planning the same
workshop in the week of 9 December in the EST time zone.
The setup of the random effects structure and covariates was
already discussed by Michael Zyphur, and can be applied in GLLVM
as well.
Kind regards,
Alain