[R-meta] phylogenetic information in both moderator and random part of rma.mv? - R-SIG-meta-analysis

Wed, Sep 28, 2022 12:03 PM #

Dear list,
I want to test for evolutionary divergence (among ectothermic animals) in a single trait Y. My first approach to this was to test for divergence among taxonomic classes (specifically amphibians, reptiles, insects, fish, and crustaceans). I compiled data on several species per class, and multiple estimates of Y per species (from different experiments), and analysed the data using a traditional mixed effects model (lme), with species as a random effect and taxonomic class as fixed effect.

However, one reviewer suggested I should control for phylogeny (without being more specific). So I built a tree for these species (using package rotl), made it ultrametric (using compute.brlen in package ape), and computed the variance-covariance matrix A from this (using vcv from package ape). I then created a variable phylo to distinguish the phylogenetic component from the non-phlylogenetic species random effect, and used metafor to fit the model:

rma.mv(Y, sampvar,mods = ~ Class, random = list(~ 1 | species, ~ 1 | phylo),
               R = list(phylo = A), data = Final.data, sparse = TRUE, method = "ML")

(sampvar is the sampling variance associated with each estimate of Y)

However, now I have started doubting whether this model makes sense, i.e. to estimate an effect of taxonomic class (which in essence is a phylogenetic effect) while simultaneously modelling a random effect of phylogeny. Would an appropriate alternative be to not have class as a moderator, but rather compare fits of models with and without the phylogenetic variance-covariance matrix. If the model including phylogeny is better than one without it, can I conclude that there is evolutionary divergence in this trait?

Any advice anyone might have on this would be much appreciated!

Best wishes
Sigurd Einum

Wolfgang Viechtbauer

Thu, Sep 29, 2022 7:58 AM #

Dear Sigurd,

I do not know enough about the specifics of the application to say whether comparing the model with versus without phylogeny is sufficient to conclude something about evolutionary divergence.

However, let me make a few comments (I am also basing some comments here on what you wrote to me initially in an email before redirecting you to this mailing list for further discussions):

So you fitted a mixed-effects model with lme() of the form:

lme(Y ~ Class, random = ~ 1 | species, data=Final.data, method="ML")

(or maybe including weights = varFixed(~ sampvar)) and found 'Class' to be relevant (regardless of whether this means: large coefficient, statistically significant, sufficiently larger value of the information criterion compared to a null model). Then you fitted the model below (where you are accounting for phylogeny) and now the support for the relevance of 'Class' disappears or is considerably weaker. I hope this summarizes the issue.

First, I would ask: Have you compared the lme() model with this?

rma.mv(Y, sampvar, mods = ~ Class, random = ~ 1 | species, data = Final.data, sparse = TRUE, method = "ML")

Note that this is not exactly the same model as what lme() fits as explained here:

https://www.metafor-project.org/doku.php/tips:rma_vs_lm_lme_lmer

This aside, it is quite possible that the relevance / effect / significance of 'Class' changes when accounting for the phylogeny, because this accounts for the potential dependence of the outcome due to similarities between species in a different way than just including species itself as a random effect.

Now does it make sense to include Class as a moderator while also including random effects for species? I would say yes. Class is a broader category, so the species random effect accounts for heterogeneity within classes (and the fixed effect for class allows the average of all species belonging to the same class to differ from that of other classes). And whether the values of this random effect are correlated or not (as predicted by the phylogeny) is an empirical question. So if the model with phylogeny fits better, then I would go with that.

Best,
Wolfgang

-----Original Message-----
From: R-sig-meta-analysis [mailto:r-sig-meta-analysis-bounces at r-project.org] On
Behalf Of Sigurd Einum
Sent: Wednesday, 28 September, 2022 21:04
To: r-sig-meta-analysis at r-project.org
Subject: [R-meta] phylogenetic information in both moderator and random part of
rma.mv?

Dear list,

I want to test for evolutionary divergence (among ectothermic animals) in a
single trait Y. My first approach to this was to test for divergence among
taxonomic classes (specifically amphibians, reptiles, insects, fish, and
crustaceans). I compiled data on several species per class, and multiple
estimates of Y per species (from different experiments), and analysed the data
using a traditional mixed effects model (lme), with species as a random effect
and taxonomic class as fixed effect.

However, one reviewer suggested I should control for phylogeny (without being
more specific). So I built a tree for these species (using package rotl), made it
ultrametric (using compute.brlen in package ape), and computed the variance-
covariance matrix A from this (using vcv from package ape). I then created a
variable phylo to distinguish the phylogenetic component from the non-
phlylogenetic species random effect, and used metafor to fit the model:

rma.mv(Y, sampvar,mods = ~ Class, random = list(~ 1 | species, ~ 1 | phylo),
              R = list(phylo = A), data = Final.data, sparse = TRUE, method =
"ML")

(sampvar is the sampling variance associated with each estimate of Y)

However, now I have started doubting whether this model makes sense, i.e. to
estimate an effect of taxonomic class (which in essence is a phylogenetic effect)
while simultaneously modelling a random effect of phylogeny. Would an appropriate
alternative be to not have class as a moderator, but rather compare fits of
models with and without the phylogenetic variance-covariance matrix. If the model
including phylogeny is better than one without it, can I conclude that there is
evolutionary divergence in this trait?

Any advice anyone might have on this would be much appreciated!

Best wishes
Sigurd Einum

Sigurd Einum

Fri, Sep 30, 2022 1:25 AM #

Dear Wolfgang, thank you for these insights! This clarified for me the way effects of class and phylogeny (within class) is partitioned in the model! 
When I fit the two models you suggested (see below), AICc is lower for mod2 (with delta AICc > 2). My interpretation of your reply, i.e. that the effect of phylogeny is an empirical question, is that since there is little evidence for a phylogenetic signal within classes for this data set, mod2 (i.e. treating the species as independent observations as long as I account for the random species effect (1|species)) is appropriate when estimating the class effect. 

mod1 <- rma.mv(lambda, sampvar,mods = ~ Class, random = list(~ 1 | study, ~ 1 | species, ~ 1 | phylo),
               R = list(phylo = A), data = Final.data, sparse = TRUE, method = "REML")

mod2 <- rma.mv(lambda, sampvar,mods = ~ Class, random = list(~ 1 | study, ~ 1 | species),
                                   data = Final.data, sparse = TRUE, method = "REML")

Best wishes,
Sigurd

-----Original Message-----
From: Viechtbauer, Wolfgang (NP)
<wolfgang.viechtbauer at maastrichtuniversity.nl>
Sent: Thursday, September 29, 2022 4:59 PM
To: r-sig-meta-analysis at r-project.org
Cc: Sigurd Einum <sigurd.einum at ntnu.no>
Subject: RE: phylogenetic information in both moderator and random part of
rma.mv?

Dear Sigurd,

I do not know enough about the specifics of the application to say whether
comparing the model with versus without phylogeny is sufficient to conclude
something about evolutionary divergence.

However, let me make a few comments (I am also basing some comments here
on what you wrote to me initially in an email before redirecting you to this
mailing list for further discussions):

So you fitted a mixed-effects model with lme() of the form:

lme(Y ~ Class, random = ~ 1 | species, data=Final.data, method="ML")

(or maybe including weights = varFixed(~ sampvar)) and found 'Class' to be
relevant (regardless of whether this means: large coefficient, statistically
significant, sufficiently larger value of the information criterion compared to a
null model). Then you fitted the model below (where you are accounting for
phylogeny) and now the support for the relevance of 'Class' disappears or is
considerably weaker. I hope this summarizes the issue.

First, I would ask: Have you compared the lme() model with this?

rma.mv(Y, sampvar, mods = ~ Class, random = ~ 1 | species, data = Final.data,
sparse = TRUE, method = "ML")

Note that this is not exactly the same model as what lme() fits as explained here:

https://www.metafor-project.org/doku.php/tips:rma_vs_lm_lme_lmer

This aside, it is quite possible that the relevance / effect / significance of 'Class'
changes when accounting for the phylogeny, because this accounts for the
potential dependence of the outcome due to similarities between species in a
different way than just including species itself as a random effect.

Now does it make sense to include Class as a moderator while also including
random effects for species? I would say yes. Class is a broader category, so the
species random effect accounts for heterogeneity within classes (and the fixed
effect for class allows the average of all species belonging to the same class to
differ from that of other classes). And whether the values of this random effect
are correlated or not (as predicted by the phylogeny) is an empirical question. So
if the model with phylogeny fits better, then I would go with that.

Best,
Wolfgang

-----Original Message-----
From: R-sig-meta-analysis
[mailto:r-sig-meta-analysis-bounces at r-project.org] On Behalf Of Sigurd
Einum
Sent: Wednesday, 28 September, 2022 21:04
To: r-sig-meta-analysis at r-project.org
Subject: [R-meta] phylogenetic information in both moderator and random
part of rma.mv?

Dear list,

I want to test for evolutionary divergence (among ectothermic animals)
in a single trait Y. My first approach to this was to test for
divergence among taxonomic classes (specifically amphibians, reptiles,
insects, fish, and crustaceans). I compiled data on several species per
class, and multiple estimates of Y per species (from different
experiments), and analysed the data using a traditional mixed effects
model (lme), with species as a random effect and taxonomic class as fixed

effect.

However, one reviewer suggested I should control for phylogeny (without
being more specific). So I built a tree for these species (using
package rotl), made it ultrametric (using compute.brlen in package
ape), and computed the variance- covariance matrix A from this (using
vcv from package ape). I then created a variable phylo to distinguish
the phylogenetic component from the non- phlylogenetic species random

effect, and used metafor to fit the model:

rma.mv(Y, sampvar,mods = ~ Class, random = list(~ 1 | species, ~ 1 | phylo),
              R = list(phylo = A), data = Final.data, sparse = TRUE,
method =
"ML")

(sampvar is the sampling variance associated with each estimate of Y)

However, now I have started doubting whether this model makes sense,
i.e. to estimate an effect of taxonomic class (which in essence is a
phylogenetic effect) while simultaneously modelling a random effect of
phylogeny. Would an appropriate alternative be to not have class as a
moderator, but rather compare fits of models with and without the
phylogenetic variance-covariance matrix. If the model including
phylogeny is better than one without it, can I conclude that there is evolutionary

divergence in this trait?

Any advice anyone might have on this would be much appreciated!

Best wishes
Sigurd Einum

Wolfgang Viechtbauer

Fri, Sep 30, 2022 2:44 AM #

Sounds reasonable.

One additional thing: I don't know what your data structure actually looks like. Is each row a different study? Or is each row a different species? If not, then I would strongly suggest to include '~ 1 | id' where

Final.data$id <- 1:nrow(Final.data)

as a random effect (in both mod1 and mod2). Otherwise you would be assuming homogeneity of true outcomes within studies/species, which is a strong assumption.

Best,
Wolfgang

-----Original Message-----
From: Sigurd Einum [mailto:sigurd.einum at ntnu.no]
Sent: Friday, 30 September, 2022 10:26
To: Viechtbauer, Wolfgang (NP); r-sig-meta-analysis at r-project.org
Subject: RE: phylogenetic information in both moderator and random part of
rma.mv?

Dear Wolfgang, thank you for these insights! This clarified for me the way
effects of class and phylogeny (within class) is partitioned in the model!
When I fit the two models you suggested (see below), AICc is lower for mod2 (with
delta AICc > 2). My interpretation of your reply, i.e. that the effect of
phylogeny is an empirical question, is that since there is little evidence for a
phylogenetic signal within classes for this data set, mod2 (i.e. treating the
species as independent observations as long as I account for the random species
effect (1|species)) is appropriate when estimating the class effect.

mod1 <- rma.mv(lambda, sampvar,mods = ~ Class, random = list(~ 1 | study, ~ 1 |
species, ~ 1 | phylo),
              R = list(phylo = A), data = Final.data, sparse = TRUE, method =
"REML")

mod2 <- rma.mv(lambda, sampvar,mods = ~ Class, random = list(~ 1 | study, ~ 1 |
species),
                                  data = Final.data, sparse = TRUE, method =
"REML")

Best wishes,
Sigurd

-----Original Message-----
From: Viechtbauer, Wolfgang (NP)
<wolfgang.viechtbauer at maastrichtuniversity.nl>
Sent: Thursday, September 29, 2022 4:59 PM
To: r-sig-meta-analysis at r-project.org
Cc: Sigurd Einum <sigurd.einum at ntnu.no>
Subject: RE: phylogenetic information in both moderator and random part of
rma.mv?

Dear Sigurd,

I do not know enough about the specifics of the application to say whether
comparing the model with versus without phylogeny is sufficient to conclude
something about evolutionary divergence.

However, let me make a few comments (I am also basing some comments here
on what you wrote to me initially in an email before redirecting you to this
mailing list for further discussions):

So you fitted a mixed-effects model with lme() of the form:

lme(Y ~ Class, random = ~ 1 | species, data=Final.data, method="ML")

(or maybe including weights = varFixed(~ sampvar)) and found 'Class' to be
relevant (regardless of whether this means: large coefficient, statistically
significant, sufficiently larger value of the information criterion compared to

null model). Then you fitted the model below (where you are accounting for
phylogeny) and now the support for the relevance of 'Class' disappears or is
considerably weaker. I hope this summarizes the issue.

First, I would ask: Have you compared the lme() model with this?

rma.mv(Y, sampvar, mods = ~ Class, random = ~ 1 | species, data = Final.data,
sparse = TRUE, method = "ML")

Note that this is not exactly the same model as what lme() fits as explained

here:

https://www.metafor-project.org/doku.php/tips:rma_vs_lm_lme_lmer

This aside, it is quite possible that the relevance / effect / significance of

'Class'

changes when accounting for the phylogeny, because this accounts for the
potential dependence of the outcome due to similarities between species in a
different way than just including species itself as a random effect.

Now does it make sense to include Class as a moderator while also including
random effects for species? I would say yes. Class is a broader category, so the
species random effect accounts for heterogeneity within classes (and the fixed
effect for class allows the average of all species belonging to the same class

to

differ from that of other classes). And whether the values of this random effect
are correlated or not (as predicted by the phylogeny) is an empirical question.

So

if the model with phylogeny fits better, then I would go with that.

Best,
Wolfgang

-----Original Message-----
From: R-sig-meta-analysis
[mailto:r-sig-meta-analysis-bounces at r-project.org] On Behalf Of Sigurd
Einum
Sent: Wednesday, 28 September, 2022 21:04
To: r-sig-meta-analysis at r-project.org
Subject: [R-meta] phylogenetic information in both moderator and random
part of rma.mv?

Dear list,

I want to test for evolutionary divergence (among ectothermic animals)
in a single trait Y. My first approach to this was to test for
divergence among taxonomic classes (specifically amphibians, reptiles,
insects, fish, and crustaceans). I compiled data on several species per
class, and multiple estimates of Y per species (from different
experiments), and analysed the data using a traditional mixed effects
model (lme), with species as a random effect and taxonomic class as fixed

effect.

However, one reviewer suggested I should control for phylogeny (without
being more specific). So I built a tree for these species (using
package rotl), made it ultrametric (using compute.brlen in package
ape), and computed the variance- covariance matrix A from this (using
vcv from package ape). I then created a variable phylo to distinguish
the phylogenetic component from the non- phlylogenetic species random

effect, and used metafor to fit the model:

rma.mv(Y, sampvar,mods = ~ Class, random = list(~ 1 | species, ~ 1 | phylo),
              R = list(phylo = A), data = Final.data, sparse = TRUE,
method =
"ML")

(sampvar is the sampling variance associated with each estimate of Y)

However, now I have started doubting whether this model makes sense,
i.e. to estimate an effect of taxonomic class (which in essence is a
phylogenetic effect) while simultaneously modelling a random effect of
phylogeny. Would an appropriate alternative be to not have class as a
moderator, but rather compare fits of models with and without the
phylogenetic variance-covariance matrix. If the model including
phylogeny is better than one without it, can I conclude that there is

evolutionary

divergence in this trait?

Any advice anyone might have on this would be much appreciated!

Best wishes
Sigurd Einum

Sigurd Einum

Fri, Sep 30, 2022 4:06 AM #

There are (usually) several rows (observations of Y and associated sampling variance) for each species, as well as for each study. Hence the random structure   ~ 1|study, ~ 1|species. 

I tried adding the id variable and the 1|id random structure as you recommended (model estimates of class parameters remain basically unaltered, whereas AICc drops a lot), but I don't really understand what this achieves, and I have never used or encountered the use of such a structure with each observed data point being assigned a separate random effect in mixed models (lme/lmer) previously. Is it somehow related to the weight= varIdent in lme, which allows for different random effects for different groups of observations? But in this case the model allows for different random effects for each single observation? It is probably beyond me to understand this (I'm not a statistician), which is fine, but if possible I would like to try, so if you can point me to a resource where I could read up on this that would be great. I read through this https://wviechtb.github.io/metafor/reference/rma.mv.html but still not sure I got it. 

Thanks again for all your helpful advice!
Sigurd

-----Original Message-----
From: Viechtbauer, Wolfgang (NP)
<wolfgang.viechtbauer at maastrichtuniversity.nl>
Sent: Friday, September 30, 2022 11:45 AM
To: r-sig-meta-analysis at r-project.org
Cc: Sigurd Einum <sigurd.einum at ntnu.no>
Subject: RE: phylogenetic information in both moderator and random part of
rma.mv?

Sounds reasonable.

One additional thing: I don't know what your data structure actually looks like. Is
each row a different study? Or is each row a different species? If not, then I
would strongly suggest to include '~ 1 | id' where

Final.data$id <- 1:nrow(Final.data)

as a random effect (in both mod1 and mod2). Otherwise you would be assuming
homogeneity of true outcomes within studies/species, which is a strong
assumption.

Best,
Wolfgang

-----Original Message-----
From: Sigurd Einum [mailto:sigurd.einum at ntnu.no]
Sent: Friday, 30 September, 2022 10:26
To: Viechtbauer, Wolfgang (NP); r-sig-meta-analysis at r-project.org
Subject: RE: phylogenetic information in both moderator and random part
of rma.mv?

Dear Wolfgang, thank you for these insights! This clarified for me the
way effects of class and phylogeny (within class) is partitioned in the model!
When I fit the two models you suggested (see below), AICc is lower for
mod2 (with delta AICc > 2). My interpretation of your reply, i.e. that
the effect of phylogeny is an empirical question, is that since there
is little evidence for a phylogenetic signal within classes for this
data set, mod2 (i.e. treating the species as independent observations
as long as I account for the random species effect (1|species)) is appropriate

when estimating the class effect.

mod1 <- rma.mv(lambda, sampvar,mods = ~ Class, random = list(~ 1 |
study, ~ 1 | species, ~ 1 | phylo),
              R = list(phylo = A), data = Final.data, sparse = TRUE,
method =
"REML")

mod2 <- rma.mv(lambda, sampvar,mods = ~ Class, random = list(~ 1 |
study, ~ 1 | species),
                                  data = Final.data, sparse = TRUE,
method =
"REML")

Best wishes,
Sigurd

-----Original Message-----
From: Viechtbauer, Wolfgang (NP)
<wolfgang.viechtbauer at maastrichtuniversity.nl>
Sent: Thursday, September 29, 2022 4:59 PM
To: r-sig-meta-analysis at r-project.org
Cc: Sigurd Einum <sigurd.einum at ntnu.no>
Subject: RE: phylogenetic information in both moderator and random
part of rma.mv?

Dear Sigurd,

I do not know enough about the specifics of the application to say
whether comparing the model with versus without phylogeny is
sufficient to conclude something about evolutionary divergence.

However, let me make a few comments (I am also basing some comments
here on what you wrote to me initially in an email before redirecting
you to this mailing list for further discussions):

So you fitted a mixed-effects model with lme() of the form:

lme(Y ~ Class, random = ~ 1 | species, data=Final.data, method="ML")

(or maybe including weights = varFixed(~ sampvar)) and found 'Class'
to be relevant (regardless of whether this means: large coefficient,
statistically significant, sufficiently larger value of the
information criterion compared to

null model). Then you fitted the model below (where you are accounting
for
phylogeny) and now the support for the relevance of 'Class' disappears
or is considerably weaker. I hope this summarizes the issue.

First, I would ask: Have you compared the lme() model with this?

rma.mv(Y, sampvar, mods = ~ Class, random = ~ 1 | species, data =
Final.data, sparse = TRUE, method = "ML")

Note that this is not exactly the same model as what lme() fits as
explained

here:

https://www.metafor-project.org/doku.php/tips:rma_vs_lm_lme_lmer

This aside, it is quite possible that the relevance / effect /
significance of

'Class'

changes when accounting for the phylogeny, because this accounts for
the potential dependence of the outcome due to similarities between
species in a different way than just including species itself as a random effect.

Now does it make sense to include Class as a moderator while also
including random effects for species? I would say yes. Class is a
broader category, so the species random effect accounts for
heterogeneity within classes (and the fixed effect for class allows
the average of all species belonging to the same class

to

differ from that of other classes). And whether the values of this
random effect are correlated or not (as predicted by the phylogeny) is an

empirical question.

So

if the model with phylogeny fits better, then I would go with that.

Best,
Wolfgang

-----Original Message-----
From: R-sig-meta-analysis
[mailto:r-sig-meta-analysis-bounces at r-project.org] On Behalf Of
Sigurd Einum
Sent: Wednesday, 28 September, 2022 21:04
To: r-sig-meta-analysis at r-project.org
Subject: [R-meta] phylogenetic information in both moderator and
random part of rma.mv?

Dear list,

I want to test for evolutionary divergence (among ectothermic
animals) in a single trait Y. My first approach to this was to test
for divergence among taxonomic classes (specifically amphibians,
reptiles, insects, fish, and crustaceans). I compiled data on several
species per class, and multiple estimates of Y per species (from
different experiments), and analysed the data using a traditional
mixed effects model (lme), with species as a random effect and
taxonomic class as fixed

effect.

However, one reviewer suggested I should control for phylogeny
(without being more specific). So I built a tree for these species
(using package rotl), made it ultrametric (using compute.brlen in
package ape), and computed the variance- covariance matrix A from
this (using vcv from package ape). I then created a variable phylo to
distinguish the phylogenetic component from the non- phlylogenetic
species random

effect, and used metafor to fit the model:

rma.mv(Y, sampvar,mods = ~ Class, random = list(~ 1 | species, ~ 1 | phylo),
              R = list(phylo = A), data = Final.data, sparse = TRUE,
method =
"ML")

(sampvar is the sampling variance associated with each estimate of Y)

However, now I have started doubting whether this model makes sense,
i.e. to estimate an effect of taxonomic class (which in essence is a
phylogenetic effect) while simultaneously modelling a random effect
of phylogeny. Would an appropriate alternative be to not have class
as a moderator, but rather compare fits of models with and without
the phylogenetic variance-covariance matrix. If the model including
phylogeny is better than one without it, can I conclude that there is

evolutionary

divergence in this trait?

Any advice anyone might have on this would be much appreciated!

Best wishes
Sigurd Einum