
Using GLMs or GLMMs for diversity metrics?

4 messages · Highland Statistics Ltd, Philippi, Tom +1 more

#
Hello Alana,

Instead of using a diversity index, why not focus on the original 
species using a multivariate GLMM? You can use a generalised linear 
latent variable model (GLLVM) for this. That is a more useful analysis 
than using 4 different diversity indices (which, by the way, 
are all derived from the same data, and that is a problem in itself).

You can find information on GLLVM here:

https://jenniniku.github.io/gllvm/articles/vignette1.html


Or you can join one of our upcoming online workshops on GLLVM:

https://www.highstat.com/Courses/Flyers/Flyer2024_01_SpatTempGLM.pdf

This workshop is in the EU time zone, but we are planning the same 
workshop in the week of 9 December in the EST time zone.

The setup of the random effects structure and covariates was already 
discussed by Michael Zyphur, and can be applied in GLLVM as well.

Kind regards,

Alain
2 days later
#
One reason Alana might not want to use GLLVM instead is the repeated sites and repeated years structure.

The standard model for a trend in a single species with multiple sites revisited multiple times (years), from VanLeeuwen et al. (1996) and Piepho & Ogutu (2002), includes random intercepts and (temporal) slopes among sites, plus random effects for temporal (year-to-year) fluctuations that are concordant across all sites. In lme4 formula notation (where yearC is centered continuous year and yearF is year as a factor):
Y ~ yearC + (1 | yearF) + (1 + yearC | siteID)

One of those papers notes that what I denote as "yearC" could just as well be a predictor covariate that varies over years: the general model still needs to account for fluctuations that are correlated (concordant) across all sites. In this case, with an interest in warm v cold years, the model would be:

Y ~ temperature + (1 | yearF) + (1 + temperature | siteID)
with temperature a factor of {warm, cold}

The appropriate "test" of warm v cold years is against year to year fluctuations, lest a wet spring or some other event affecting all sites be interpreted as 6 independent events across the 6 sites.

Unless the study spans a fairly large number of years, the binary nature of the temperature predictor and the small number of sites (only 6) may mean that the simpler model needs to be fit:
Y ~ temperature + (1 | yearF) + (1 | siteID)
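
The three formulas above could be fit along the following lines. This is only a minimal sketch with normal errors on simulated data: the design (6 sites, 12 years), the warm/cold labels, and every effect size are made-up assumptions for illustration, and with so few sites lme4 may well report a singular fit for the random-slope models.

```r
library(lme4)

set.seed(1)

# Hypothetical design: 6 sites, each revisited in the same 12 years
dat <- expand.grid(siteID = factor(1:6), year = 2001:2012)
dat$yearC <- dat$year - mean(dat$year)   # centered continuous year
dat$yearF <- factor(dat$year)            # year as a factor

# Hypothetical warm/cold label: one value per year, shared by all sites
warm_years <- c(2002, 2003, 2006, 2008, 2010, 2011)
dat$temperature <- factor(ifelse(dat$year %in% warm_years, "warm", "cold"))

# Simulated response (all values are assumptions): site intercepts and slopes,
# year shocks shared by all sites, a warm-year shift, and residual noise
site_int   <- rnorm(6, sd = 1)
site_slope <- rnorm(6, sd = 0.1)
year_eff   <- rnorm(12, sd = 0.7)
s <- as.integer(dat$siteID)
y <- as.integer(dat$yearF)
dat$Y <- 10 + (0.2 + site_slope[s]) * dat$yearC +
  ifelse(dat$temperature == "warm", 1, 0) +
  site_int[s] + year_eff[y] + rnorm(nrow(dat), sd = 0.5)

# Trend model: random site intercepts and slopes, random year effect
m_trend <- lmer(Y ~ yearC + (1 | yearF) + (1 + yearC | siteID), data = dat)

# Warm v cold model with site-specific temperature effects
m_temp_slopes <- lmer(Y ~ temperature + (1 | yearF) + (1 + temperature | siteID),
                      data = dat)

# Simpler model: random intercepts only
m_temp <- lmer(Y ~ temperature + (1 | yearF) + (1 | siteID), data = dat)

summary(m_temp)
```

In the summary, the temperature effect is judged against the yearF variance, so a shock shared by all 6 sites in one year counts as one event rather than 6.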

GLLVM is an awesome tool, but it addresses different questions, and to the best of my knowledge and from my experimentation with it, it doesn't accommodate this form of sampling design. I would love to be corrected by an example of how to specify this random effect structure driven by the sampling process. And I would probably include something from GLLVM as an additional perspective on patterns in the data.

That Piepho & Ogutu model in glmer works for counts of a species, and for species richness, via Poisson or negative binomial families.  [brms or glmmTMB may do a better job on the estimation than glmer.]  Diversity indices & evenness are a bit trickier because they are continuous but their error distributions are rarely normal and can be constrained or truncated.
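
For species richness, the count version might look like the sketch below. It uses only lme4 (glmer for Poisson, glmer.nb for negative binomial). The simulated richness values and all effect sizes are assumptions, not data from this thread, and convergence or singular-fit warnings are quite possible with only 6 sites.

```r
library(lme4)

set.seed(2)

# Hypothetical design: 6 sites, each revisited in the same 12 years
dat <- expand.grid(siteID = factor(1:6), year = 2001:2012)
dat$yearC <- dat$year - mean(dat$year)
dat$yearF <- factor(dat$year)

# Simulated overdispersed richness counts (all parameter values are assumptions)
site_eff <- rnorm(6, sd = 0.3)
year_eff <- rnorm(12, sd = 0.2)
mu <- exp(3 + 0.03 * dat$yearC +
          site_eff[as.integer(dat$siteID)] +
          year_eff[as.integer(dat$yearF)])
dat$richness <- rnbinom(nrow(dat), mu = mu, size = 5)

# Poisson version of the trend model
m_pois <- glmer(richness ~ yearC + (1 | yearF) + (1 + yearC | siteID),
                data = dat, family = poisson)

# Negative binomial version; the dispersion parameter is estimated
m_nb <- glmer.nb(richness ~ yearC + (1 | yearF) + (1 + yearC | siteID),
                 data = dat)

# Compare the two error distributions
AIC(m_pois, m_nb)
```

The same formulas should carry over to glmmTMB or brms, which have their own family specifications for the negative binomial.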

I'm jumping in here because I was dealing with just this issue last week for vegetation monitoring at Santa Monica Mountains National Recreation Area. They have species richness recorded by segments of a single 1x30m transect at each site, and will be testing for trends over time in species richness at the spatial scales of transects, and of 1x5m segments (nested within transects). Their approach to a Shannon diversity metric from 100 point intercepts at each site is to fit the P&O model with the additional covariates as normal error, then use a densityplot of the residuals to see if they are unimodal and close to normally distributed. If not, they'll use a different error distribution in glmmTMB or brms. Note that they also have a beta diversity among segments within each site, and could in theory treat that the same way they treat the diversity metric from the point intercepts.
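
The residual check described above could be sketched as follows. The point-intercept data are simulated (8 hypothetical species, 100 points per site visit), the `shannon` helper is written here for illustration, and no additional covariates are included; none of this is the park's actual code or data.

```r
library(lme4)
library(lattice)

set.seed(3)

# Shannon index H = -sum(p * log(p)) from a vector of species counts
shannon <- function(counts) {
  p <- counts[counts > 0] / sum(counts)
  -sum(p * log(p))
}

# Hypothetical design: 6 sites, each revisited in the same 12 years
dat <- expand.grid(siteID = factor(1:6), year = 2001:2012)
dat$yearC <- dat$year - mean(dat$year)
dat$yearF <- factor(dat$year)

# Simulated 100 point intercepts per site visit across 8 hypothetical species
n_spp <- 8
site_prob <- matrix(rgamma(6 * n_spp, shape = 1), nrow = 6)
dat$H <- sapply(seq_len(nrow(dat)), function(i) {
  # Site-specific composition with visit-level noise;
  # rmultinom rescales prob to sum to 1 internally
  prob <- site_prob[as.integer(dat$siteID[i]), ] *
    rgamma(n_spp, shape = 5, rate = 5)
  shannon(rmultinom(1, size = 100, prob = prob)[, 1])
})

# Trend model with normal errors
m_H <- lmer(H ~ yearC + (1 | yearF) + (1 + yearC | siteID), data = dat)

# Density plot of the residuals: look for one mode and rough symmetry
res <- data.frame(r = resid(m_H))
p <- densityplot(~ r, data = res, xlab = "Residual")
print(p)
```

If the density is clearly skewed, bimodal, or piled up against a bound, that would be the cue to try another error distribution in glmmTMB or brms.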

I hope this helps you think about what you are trying to learn about your data, especially the not always obvious artifacts of the sampling that should be accounted for in the analyses, lest your results reflect the sampling design rather than the ecological responses.

Irwin, B.J., Wagner, T., Bence, J.R., Kepler, M.V., Liu, W. and Hayes, D.B., 2013. Estimating spatial and temporal components of variation for fisheries count data using negative binomial mixed models. Transactions of the American Fisheries Society, 142(1), pp.171-183.

Piepho, H.P. and Ogutu, J.O., 2002. A simple mixed model for trend analysis in wildlife populations. Journal of Agricultural, Biological, and Environmental Statistics, 7, pp.350-360.

VanLeeuwen, D.M., Murray, L.W. and Urquhart, N.S., 1996. A mixed model with both fixed and random trend components across time. Journal of Agricultural, Biological, and Environmental Statistics, pp.435-453.

Tom Philippi
Inventory and Monitoring Program Central Support Office
National Park Service
Tom_Philippi at nps.gov




-----Original Message-----
From: R-sig-ecology <r-sig-ecology-bounces at r-project.org> On Behalf Of Alain Zuur via R-sig-ecology
Sent: Sunday, October 13, 2024 2:41 AM
To: r-sig-ecology at r-project.org; acbarton at mun.ca
Subject: [EXTERNAL] Re: [R-sig-eco] Using GLMs or GLMMs for diversity metrics?
#
Hello Tom,

The development version of gllvm allows you to include multiple random effects, and it can also do random slopes. Hence, I think that the models below can be fitted in gllvm. But you may want to double check that with the gllvm folks (they are super helpful).

With a package like gllvm available, one should really stop doing the more classical multivariate methods.

Personally, I do not like to use year (your yearF) as a random intercept, but that is a different discussion.

Kind regards,

Alain
On 15 Oct 2024 at 20:24 +0200, Philippi, Tom <Tom_Philippi at nps.gov>, wrote:

  
  
#
Dear all,

"gllvm folks" here. Alain is right. You should now be able to incorporate that kind of random effect structure in the gllvm package (currently in the dev version).

It's a recent addition. If you run into any trouble while implementing such models, shoot us a message and we'll try to help the best we can. There is a vignette available on phylogenetic random effects that (briefly) demonstrates the interface.

Kind regards,
Bert