
[R-meta] Meta-analysis questions

Hi Bethan,

I'll respond to your first questions and add a query of my own.

First, non-normality and high kurtosis are indeed problems for the modeling
approach you've taken. One consequence is that you will likely have very
high levels of estimated heterogeneity. Another consequence, and potential
concern, is that the standard errors and confidence intervals that come
from rma.mv() should probably not be trusted because they are predicated on
assumptions about normality of the random effects in the specified model.
There are (at least) two ways that you could deal with this issue. First,
you could ignore the standard errors from rma.mv() and instead use robust
variance estimation methods, which are asymptotically robust to
non-normality as well as mis-specification of the model covariance
structure. For example, the following will give you robust confidence
intervals for your model:

library(clubSandwich)

conf_int(Traitcat_model, vcov = "CR2")

An alternative to RVE is to use bootstrap confidence intervals, as you've
attempted. However, the usual implementation of bootstrapping will not work
here. Because you've got dependent effect sizes with a multi-level
structure, you'll need to bootstrap re-sample *at the study level* instead
of re-sampling individual observations. Here's a rough sketch of how to
cluster bootstrap:

library(boot)
library(metafor)

study_ids <- unique(Traitcatdata$Study_number)

boot.func <- function(study_ids, indices) {

  # Keep every row belonging to a re-sampled study. (Note: a study drawn
  # more than once is only included once here--a simplification of a full
  # cluster bootstrap, which would stack duplicated studies as distinct
  # clusters.)
  row_indices <- Traitcatdata$Study_number %in% study_ids[indices]

  Traitcat_model2 <- try(suppressWarnings(
    rma.mv(yi = Ln_response_corrected, V = Variance,
           data = Traitcatdata,
           mods = ~ LnSR:Trait_cat - 1,
           test = "t",
           random = list(~ 1 | Study_number/Response_id/Effect_size_id),
           method = "REML",
           subset = row_indices)
  ))

  if (inherits(Traitcat_model2, "try-error")) {
    # boot() needs a vector of the same length from every replicate, so
    # pad failed fits with NAs (one slot per coefficient plus the three
    # variance components)
    rep(NA_real_, nlevels(factor(Traitcatdata$Trait_cat)) + 3)
  } else {
    c(coef(Traitcat_model2), Traitcat_model2$sigma2)
  }
}

res.boot2 <- boot(study_ids, boot.func, R = 5000)


Cluster-bootstrapped confidence intervals will usually give you results
that are quite similar to the RVE approach.
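Once res.boot2 has run, you can pull percentile intervals from the columns of res.boot2$t. boot.ci() will choke on replicates that returned NA, so it's simplest to take the quantiles directly. Here's the idea, with a simulated stand-in for the replicate matrix (the stand-in is just illustrative, not your model's output):

```r
# Percentile CIs from a bootstrap replicate matrix that may contain NA
# rows from failed model fits. Simulated stand-in for res.boot2$t:
set.seed(20210913)
boot_reps <- matrix(rnorm(5000 * 3), ncol = 3)
boot_reps[sample(5000, 20), ] <- NA  # pretend a few fits failed

# 2.5% and 97.5% quantiles of each statistic, ignoring failed replicates
boot_CIs <- t(apply(boot_reps, 2, quantile,
                    probs = c(0.025, 0.975), na.rm = TRUE))
colnames(boot_CIs) <- c("ci_lb", "ci_ub")
boot_CIs
```

With your actual res.boot2, you would apply the same two lines to res.boot2$t.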


To your question about reporting the bootstrap CIs with the original model
estimates versus with "bootstrapped estimates", I assume that "bootstrapped
estimates" means the average (arithmetic mean) of the bootstrap
distribution. Usually these should be quite close to the point estimates
from the original model, particularly with a linear model such as yours. If
they're discrepant, then something weird is going on that probably warrants
seeking help from a statistician.
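As a quick sanity check of that claim, here is a toy simulation (plain base R, unrelated to your model) showing that the average of an ordinary bootstrap distribution of a linear statistic lands essentially on top of the original estimate:

```r
set.seed(1)
x <- rnorm(200, mean = 0.5)
est <- mean(x)  # original point estimate

# 2000 bootstrap replicates of the sample mean
boot_means <- replicate(2000, mean(sample(x, replace = TRUE)))

# The bootstrap average sits very close to the original estimate
c(original = est, bootstrap_average = mean(boot_means))
```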


And an additional question for you: I see in your model specification that
you've dropped the intercepts, so for each trait category, you're modeling
the slope of the relationship between the log response ratio and the log of
the temperature differential. By dropping the intercept, you are assuming
that the magnitude of the effect size is multiplicatively related to the
magnitude of the temperature differential (i.e., if you double the
log-temperature difference, you should expect to get twice as large a log
response ratio). By interacting the slope with trait category, you're
allowing this multiplicative relationship to differ for each trait
category. But then you're also including random effects in the model, which
are assumed to have a constant variance on the scale of the log response
ratio and across trait categories. Let's ignore the hierarchical structure
for the moment and just think about one effect per study. For a given trait
category, the model would be


LRR_i = beta * log( temp diff )_i + v_i + e_i


where e_i is the sampling error with known variance Var(e_i) = V_i and v_i
is a random effect with variance Var(v_i) = tau^2. Note that this model
assumes that the degree of heterogeneity is constant across temperature
differentials, so the degree of heterogeneity in a set of studies that all
looked at very small temperature differentials is the same as the degree of
heterogeneity in a set of studies that all examined very large temperature
differentials. Does that make theoretical sense in your scientific context?
(This is an honest question--I don't know anything about your research
area!)


Alternatively, I wonder whether it might be plausible to assume that the
degree of heterogeneity is *also* multiplicatively related to the magnitude
of the temperature differential. Under that assumption, you would divide
the effect sizes and their standard errors by the log of the temperature
differential, so the new model for a given category would become


[ LRR_i / log( temp diff )_i ] = beta + v_i + e_i,


where now Var(e_i) = V_i / [log( temp diff )_i]^2 and Var(v_i) = tau^2.
Here, the variance parameter tau^2 represents heterogeneity in the response
*per unit change* in the log temperature differential. This implies that
studies with larger log temperature differentials would have
proportionately larger degrees of heterogeneity in their responses. To take
into account multiple trait categories, you could fit a model that has
indicators for each category (dropping the intercept) but you would no
longer necessarily need to include the interaction with LnSR.
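For concreteness, the rescaling amounts to the following (toy data; the column names are taken from your model, and the rma.mv() call is only sketched in a comment since I haven't run it on your data):

```r
# Toy stand-in for a few rows of Traitcatdata
dat <- data.frame(
  Ln_response_corrected = c(0.8, 1.5, -0.2, 2.1),
  Variance              = c(0.04, 0.09, 0.01, 0.16),
  LnSR                  = c(1.0, 2.0, 0.5, 3.0)
)

# Divide the effect sizes by the log temperature differential and the
# sampling variances by its square
dat$yi_scaled <- dat$Ln_response_corrected / dat$LnSR
dat$vi_scaled <- dat$Variance / dat$LnSR^2

# With the full dataset, the model would then be (sketch, untested):
# rma.mv(yi = yi_scaled, V = vi_scaled, data = Traitcatdata,
#        mods = ~ Trait_cat - 1, test = "t",
#        random = list(~ 1 | Study_number/Response_id/Effect_size_id),
#        method = "REML")
```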


James

On Sun, Sep 12, 2021 at 8:42 PM Bethan Lang <bethan.lang at my.jcu.edu.au>
wrote: