
[R-meta] Questions about multilevel meta-analysis structure

6 messages · Isaac Calvin Saywell, Reza Norouzian, James Pustejovsky

#
Hello all,

I am conducting a multilevel meta-analysis, trying to compare effect sizes across 9 different cognitive domains (using cognitive domain as a categorical moderator). In my dataset some studies provide effect sizes for multiple cognitive domains (dependent effects), some provide only one effect size per cognitive domain, and not all studies contain effect sizes for all cognitive domains of interest. I have four questions:

1. Is my model correctly structured to account for dependency using the inner | outer formula (see MODEL 1 CODE below), or should I just specify random effects at the study and unique effect size level (see MODEL 2 CODE below)?

# MODEL 1 CODE

## res <- rma.mv(vi,
##               V,
##               mods = ~ cog_domain,
##               random = list(~ cog_domain | study_id, ~ 1 | effectsize_id),
##               struct = "UN",
##               tdist = TRUE,
##               method = "REML",
##               data = dat)

# MODEL 2 CODE

## res <- rma.mv(vi,
##               V,
##               mods = ~ cog_domain,
##               random = list(~ 1 | study_id, ~ 1 | effectsize_id), # removed inner | outer formula
##               struct = "UN",
##               tdist = TRUE,
##               method = "REML",
##               data = dat)

2. If I do need to specify an inner | outer formula to compare effect sizes across cognitive domains, then is an unstructured variance-covariance matrix ("UN") most appropriate (allowing tau^2 to differ among cognitive domains) or should another structure be specified?

3. To account for effect size dependency, is a variance-covariance matrix necessary (this is what my model currently uses), or is it OK to use the sampling variance of each effect size in the multilevel model?

4. When subsetting my data by one cognitive domain and investigating that same cognitive domain in a univariate multilevel model, the effect estimate tends to be lower than when all cognitive domains are included in a single multilevel model as a moderator. Is there a reason for this?

Help with any of these questions would be greatly appreciated.

Kind regards,
Isaac

University of Adelaide, Australia
Cognitive Neural Sciences Lab
#
Hi Isaac,

Comments inline below. (You've hit on something I'm interested in, so
apologies in advance!)

James

On Thu, Jul 20, 2023 at 12:17 AM Isaac Calvin Saywell via
R-sig-meta-analysis <r-sig-meta-analysis at r-project.org> wrote:

The syntax looks correct to me except for two things. First, the first
argument of each model should presumably be yi = yi rather than vi. Second,
in Model 2, the struct argument is not necessary and will be ignored (it's
only relevant for models where the random effects have inner | outer
structure).

Conceptually, this is an interesting question. Model 1 is theoretically
appealing because it uses a more flexible, general structure than Model 2.
Model 1 is saying that there are different average effects for each
cognitive domain, and each study has a unique set of effects per cognitive
domain that are distinct from each other but can be inter-correlated. In
contrast, Model 2 is saying that the study-level random effects apply
equally to all cognitive domains---if study X has higher-than-average
effects in domain A, then it will have effects in domain B that are equally
higher-than-average.

The big caveat with Model 1 is that it can be hard to fit unless you have
lots of studies, and specifically lots of studies that report effects for
multiple cognitive domains. To figure out if it is feasible to estimate
this model, it can be useful to do some descriptives where you count the
number of studies that include effect sizes from each possible *pair* of
cognitive domains. If some pairs have very few studies, then it's going to
be difficult or impossible to fit the multivariate random effects structure
without imposing further restrictions.
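A base-R sketch of that descriptive check, using a toy data frame with hypothetical values but the column names from this thread (study_id, cog_domain):

```r
# Toy data (hypothetical): which studies report which cognitive domains.
dat <- data.frame(
  study_id   = c(1, 1, 1, 2, 2, 3, 3, 4),
  cog_domain = c("memory", "attention", "language",
                 "memory", "attention",
                 "memory", "language",
                 "attention")
)

# TRUE/FALSE matrix of studies (rows) by domains (columns)
present <- table(dat$study_id, dat$cog_domain) > 0

# Domains-by-domains co-occurrence counts: each off-diagonal cell gives
# the number of studies reporting *both* domains of that pair; the
# diagonal gives the number of studies reporting each domain at all.
pair_counts <- crossprod(present)
pair_counts
```

Pairs with counts at or near zero flag parts of the unstructured ("UN") covariance matrix that the data cannot inform.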

In case it's looking infeasible, there are some other random effects
structures that are intermediate between Model 1 and Model 2, which might
be worth trying:
Model 1.0: random = list(~ cog_domain | study_id, ~ 1 | effectsize_id), struct = "UN"
Model 1.1: random = list(~ cog_domain | study_id, ~ 1 | effectsize_id), struct = "HCS"
Model 1.2: random = list(~ cog_domain | study_id, ~ 1 | effectsize_id), struct = "CS"
Model 1.2 (equivalent specification, I think): random = ~ 1 | study_id / cog_domain / effectsize_id
Model 2.0: random = list(~ 1 | study_id, ~ 1 | effectsize_id)
Model 2.0 (equivalent specification): random = ~ 1 | study_id / effectsize_id

So perhaps there is something in between 1.0 and 2.0 that will strike a
balance between theoretical appeal and feasibility.

This has been discussed previously on the listserv. My perspective is that
you should use whatever assumptions are most plausible. If you expect that
there really is correlation in the sampling errors (e.g., because the
effect size estimates are based on correlated outcomes measured on the same
set of respondents), then I think it is more defensible to use a
non-diagonal V matrix, as in your current syntax.

Is this true for *all* of the cognitive domains or only one or a few of
them? Your Model 1 and Model 2 use random effects models that assume effect
sizes from different cognitive domains are somewhat related (i.e., the
random effects are correlated within study) and so the average effect for a
given domain will be estimated based in part on the effect size estimates
for that domain and in part by "borrowing information" from other domains
that are correlated with it. Broadly speaking, the consequence of this
borrowing of information is that the average effects will tend to be pulled
toward each other, and thus will be a little less dispersed than if you
estimate effects through subgroup analysis.

The above would explain why some domains would get pulled downward in the
multivariate model compared to the univariate model, but it would not
explain why *all* of the domains are pulled down. If it's really all of
them, then I suspect your data must have some sort of association between
average effect size and the number of effect size estimates per study.
That'd be weird and I'm not really sure how to interpret it. You could
check on this by calculating a variable (call it k_j) that is the number of
effect size estimates reported per study (across any cognitive domain) and
then including that variable as a predictor in Model 1 or Model 2 above.
This would at least tell you if there's something funky going on...
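A minimal sketch of computing that k_j variable in base R (toy data; the column names yi and study_id are taken from the thread):

```r
# Toy data (hypothetical effect size estimates)
dat <- data.frame(
  study_id = c(1, 1, 2, 3, 3, 3),
  yi       = c(0.2, 0.4, 0.1, 0.5, 0.3, 0.6)
)

# k_j: number of effect size estimates reported per study
dat$k_j <- ave(dat$yi, dat$study_id, FUN = length)

# k_j could then be added as a predictor, e.g. (metafor assumed loaded,
# V as in the original model):
# res_k <- rma.mv(yi, V, mods = ~ cog_domain + k_j,
#                 random = list(~ 1 | study_id, ~ 1 | effectsize_id),
#                 data = dat)
```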

As a bit of an aside, you can do the equivalent of a subgroup analysis
within the framework of a multivariate working model, which might be
another thing to explore to figure out what's going on. To do this, you'll
first need to recalculate your V matrix, setting the subgroup argument to
be equal to cog_domain. This amounts to making the assumption that there is
correlation between effect size estimates *within* the same domain but not
between domains of a given study. Call this new V matrix V_sub. Then try
the following model specifications:

Model 2.1: V = V_sub, random = list(~ cog_domain | study_id, ~ cog_domain | effectsize_id), struct = c("DIAG", "DIAG")
Model 2.2: V = V_sub, random = list(~ cog_domain | study_id, ~ 1 | effectsize_id), struct = "DIAG"

Model 2.1 should reproduce what you get from running separate models by
subgroup.
Model 2.2 is a slight tweak on that, which assumes that there is a common
within-study, within-subgroup variance instead of allowing this to differ
by subgroup. Model 2.2 is nested in Models 1.0 and 1.1, but not in 1.2.
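Putting the pieces together, a hedged sketch of Model 2.1 (assuming metafor and clubSandwich are installed, that dat has the columns named in this thread, and that r = 0.6 is an assumed within-study correlation, not a definitive value):

```r
library(metafor)
library(clubSandwich)

# V_sub: block-diagonal within study *and* within cognitive domain, so
# sampling errors are treated as independent across domains of a study.
V_sub <- impute_covariance_matrix(vi = dat$vi, cluster = dat$study_id,
                                  r = 0.6, subgroup = dat$cog_domain)

# Model 2.1: separate heterogeneity per domain at both levels; this
# should reproduce what running separate models by subgroup gives.
res_21 <- rma.mv(yi, V_sub, mods = ~ cog_domain,
                 random = list(~ cog_domain | study_id,
                               ~ cog_domain | effectsize_id),
                 struct = c("DIAG", "DIAG"),
                 data = dat)
```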
#
James' responses are right on. I typed these up a while ago, so instead of
discarding them I'm including them here in case they might be helpful.

In general, how you model effect sizes often depends on at least a couple of
things. First, what are your study goals/objectives? For example, is it one
of your goals/objectives to understand the extent of the relationships that
exist among the true effects associated with your 9 different cognitive
domains? Does such an understanding help you back up an existing
theoretical/practical view, or bring a new one to the fore?

If yes, then potentially one of the "~ inner | outer" type formulas in your
model could, to some extent, help.

Second, do you have empirical support to achieve your study goal? This one
essentially explains why I hedged a bit ("potentially", "one of", "to some
extent") toward the end when describing the first goal above. Typically,
the structure of the data you have collected determines which (if any) of
the available random-effects structures can lend empirical support to your
initial goal.

Some of these structures, like UN, allow you to tap into all the existing
bivariate relationships between your 9 different cognitive domains. But
that comes with a requirement: those 9 cognitive domains must have
co-occurred in a good number of the studies included in your meta-analysis.
To the extent that this is not the case, you may need to simplify your
random-effects structure using the alternative available structures
(CS, HCS, etc.).

Responses to your questions are in-line below.

1. Is my model correctly structured to account for dependency using the
inner | outer formula (see MODEL 1 CODE below), or should I just specify
random effects at the study and unique effect size level (see MODEL 2 CODE
below)?

Please see my introductory explanation above. But please also note that
struct= only works with formulas of the form ~ inner | outer where inner
is something other than an intercept (i.e., other than ~ 1). Thus, "UN"
is entirely ignored in Model 2.

2. If I do need to specify an inner | outer formula to compare effect sizes
across cognitive domains, then is an unstructured variance-covariance
matrix ("UN") most appropriate (allowing tau^2 to differ among cognitive
domains) or should another structure be specified?

Please see my introductory explanation above.

3. To account for effect size dependency, is a variance-covariance matrix
necessary (this is what my model currently uses), or is it OK to use the
sampling variance of each effect size in the multilevel model?

I'm assuming you're referring to V. You're not currently showing the
structure of V. See also James' response.

4. When subsetting my data by one cognitive domain and investigating that
same cognitive domain in a univariate multilevel model, the effect estimate
tends to be lower than when all cognitive domains are included in a single
multilevel model as a moderator. Is there a reason for this?

See James' answer.


On Thu, Jul 20, 2023 at 9:53 AM James Pustejovsky via R-sig-meta-analysis <
r-sig-meta-analysis at r-project.org> wrote:

3 days later
#
Hi James and Reza,

Thank you both for your detailed responses, they have provided more clarity on multilevel modelling and cleared up any possible misunderstandings I had.

My team and I have decided, in line with both of your suggestions, that "HCS" is the most appropriate model variance structure for our data (given there are many studies that don't include effects for all cognitive domains).

Only a couple of cognitive domains get pulled downward in the multivariate model, while most effect estimates remain quite accurate. When testing the models you suggested for the equivalent of subgroup analyses, the effect estimates for the cognitive domains that were pulled downward were much closer to the effects from the univariate models.

My last question, then, is: should we specify cognitive domain as the subgroup when imputing a variance-covariance matrix for our multilevel moderator model, or is this not appropriate? That is, would the following code be suitable?

## V <- impute_covariance_matrix(vi = dat$variance, cluster = dat$study_id, r = 0.6, subgroup = dat$cog_domain)
##
## res <- rma.mv(yi,
##               V,
##               mods = ~ cog_domain,
##               random = list(~ cog_domain | study_id, ~ 1 | unique_id),
##               struct = "HCS",
##               tdist = TRUE,
##               method = "REML",
##               data = dat)

Thank you to both of you again for sharing your expertise, it has been highly appreciated.

Kind regards,

Isaac
#
Hi Isaac,

If you think the effect size estimates from different cognitive domains will
have correlated sampling errors, then I think it would make more sense to
use the V matrix without subgroups. In my previous reply, I suggested using
the subgroup argument in impute_covariance_matrix() (or the equivalent,
metafor::vcalc()) only as a trick for doing the equivalent of a subgroup
analysis, so that there is no borrowing of information across subgroups.
However, given that you're using a model that *does* involve borrowing (due
to use of struct = "HCS"), there's not the same rationale for removing the
between-subgroup correlations in the sampling errors. As in my previous
reply, I think the general principle is to use whatever assumptions are
most plausible.

James

On Sun, Jul 23, 2023 at 8:59 PM Isaac Calvin Saywell <
isaac.saywell at adelaide.edu.au> wrote:

#
Isaac,

You don't need the "subgroup = dat$cog_domain" part in your
impute_covariance_matrix() call.

V_sub would be necessary if you were doing a subgroup analysis using
a single model, such as Model 2.1 or Model 2.2, where cog_domain can
vary within the studies.

Once you decide (e.g., based on better model fit relative to other
candidate models) to adopt a multivariate model (letting the true effects
for the cognitive domains have a joint distribution in the studies),
then V_sub, which serves to make the sampling errors associated with
the cognitive domains independent within each study, becomes irrelevant.
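Such a model-fit comparison might be sketched as follows (res_un, res_hcs, and res_cs are hypothetical rma.mv() fits that differ only in the struct argument, with the same fixed effects and data):

```r
# Side-by-side logLik / deviance / AIC / BIC for the candidate structures
fitstats(res_un, res_hcs, res_cs)

# Likelihood-ratio test for nested structures (HCS is nested in UN)
anova(res_hcs, res_un)
```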

Please also take a look at the archives, as I believe you can find
multiple useful posts discussing several other relevant issues, such as
using cluster-robust inferences (p values, CIs) for the average effects
of your cognitive domains in your current output. Alternatively, you can
check out metafor's help page related to this issue by doing: ?robust.rma.mv
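A brief sketch of that cluster-robust step (res is assumed to be an existing rma.mv fit like the one in this thread; robust() is metafor's function documented under ?robust.rma.mv):

```r
# Cluster-robust (sandwich) p values and CIs for the average domain
# effects, clustering on study; clubSandwich = TRUE requests the
# small-sample (CR2-type) adjustment.
res_robust <- robust(res, cluster = dat$study_id, clubSandwich = TRUE)
res_robust
```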

Kind regards,
Reza
