
[R-meta] Does clubSandwich::coef_test() handle crossed random-effects?

16 messages · Farzad Keyhan, Wolfgang Viechtbauer, James Pustejovsky

#
Hello Wolfgang and James,

I have 51 studies, but when I run an unconditional model (below),
coef_test() outputs a df of 4.76 (makes no sense).

I suspected that coef_test() is not picking up my highest-level
cluster (study). So, I made that explicit in the call. And then, I got
an error saying "random effects are not nested".

First, does clubSandwich handle models with crossed random effects? If
not, is there alternative code for rma.mv() models that accomplishes
something similar (I wanted to run Egger's test as well)? Of course, the
small-sample correction may not be available there.

Thanks, Fred

# Note 'scale' is crossed with 'study'
m1 <- rma.mv(yi, V, random = list(~1|study/es, ~1|scale), data=data)

coef_test(m1, "CR2")
    Coef. Estimate    SE t-stat d.f. p-val (Satt) Sig.
1 intrcpt    0.506 0.193   2.63 4.76       0.0489    *

## I made cluster explicit but get an error:
coef_test(m1, "CR2", cluster = data$study)
*** Error: Random effects are not nested within clustering
#
Hi Fred,

The dfs you see are computed based on a Satterthwaite approximation. Depending on the data structure, the dfs could very well be much lower than the number of studies/estimates.

This aside, a limitation of the cluster-robust approach is that it can only capture dependencies that occur within the groups defined by the clustering variable. Put differently, for the approach to work, there need to be a sufficient (for some definition of "sufficient") number of independent groups within the dataset. When one introduces crossed random effects, or other random effects that create dependencies across the entire dataset (like spatial correlation, or correlation due to phylogeny in meta-analyses in ecology and evolution involving multiple species), then there may be only a single (or very few) independent subsets in the data, and the cluster-robust approach cannot really be used anymore.

Best,
Wolfgang
#
Hi Farzad,

clubSandwich does not support clustering in models with crossed random effects. 

There are generalizations of CRVE that work under crossed dependence structures (see Cameron, Gelbach, & Miller, 2011; Cameron & Miller, 2015). But the small-sample corrections implemented in clubSandwich have yet to be worked out for multi-way clustering, and these generalizations only work when there are sufficiently large numbers of clusters *in each clustering dimension*.

Based on currently available methods and software, I would suggest sticking with model-based inference (as implemented in rma.mv) if your interest is in a model with crossed random effects. That means you'd either have to do the work to develop a defensible V matrix and/or conduct sensitivity analysis for whatever assumptions go into constructing the V matrix.
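A rough sketch of what such a sensitivity analysis could look like (this assumes a constant sampling correlation r within studies, imputed with clubSandwich; the variable names are taken from your m1 call, so adjust as needed):

```r
# Sketch only: refit the model under several assumed sampling correlations r
# and check how the pooled estimate and its SE move.
library(metafor)
library(clubSandwich)

r_values <- c(0.2, 0.4, 0.6, 0.8)

fits <- lapply(r_values, function(r) {
  # Block-diagonal V: effects within the same study share correlation r
  V <- impute_covariance_matrix(data$vi, cluster = data$study, r = r)
  rma.mv(yi, V, random = list(~ 1 | study/es, ~ 1 | scale), data = data)
})

# Compare the intercept and its model-based SE across assumed r values
data.frame(r  = r_values,
           est = sapply(fits, function(m) m$b[1]),
           se  = sapply(fits, function(m) m$se[1]))
```

If the estimate and SE are stable across plausible r values, the choice of working correlation matters little for your conclusions.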

James

Cameron, A. C., Gelbach, J. B., & Miller, D. L. (2011). Robust inference with multiway clustering. Journal of Business & Economic Statistics, 29(2), 238-249.

Cameron, A. C., & Miller, D. L. (2015). A practitioner's guide to cluster-robust inference. Journal of Human Resources, 50(2), 317-372.

#
Hi James and Wolfgang,

Thank you both. Please note that 'scale' doesn't vary within each 'study'.
Thus, to generalize beyond 'scale' levels, the only way was to add it as a
crossed random-effect.

James, does that make sense to construct a V based only on 'study' cluster
using 'impute_covariance_matrix' but then conduct sensitivity on 'r'?

Thanks,
Fred
On Sun, Oct 3, 2021, 8:25 AM James Pustejovsky <jepusto at gmail.com> wrote:

#
Ah I see. In my previous message, I think I misinterpreted what was going
on. Based on what you've just described, it sounds like you actually do not
have crossed random effects. Rather, you've got a hierarchical structure
where effect sizes are nested within studies and studies are nested within
scales. If you do not specify a clustering variable, clubSandwich defaults
to clustering based on the highest level of random effects, so in your case
you were clustering on the scale ID. (You could confirm this by running
coef_test(m1, "CR2", cluster = data$scale), to see if the results are
identical to coef_test(m1, "CR2").) If there are only a few unique values
of the scale variable (I would guess maybe around 6 or 7?) then this would
also explain the low degrees of freedom.

So, in addition to the approach described in my previous message, a further
alternative would be to treat scale as a set of fixed effects (i.e., dummy
variables for each level of scale), and then cluster on studyID. Strictly
speaking, this means that you would not be generalizing beyond the observed
set of scales. But if you only observe 6 unique levels, trying to do so may
be asking for more than the data can support.
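A sketch of what I mean (variable names assumed to match your earlier call; V would come from whatever working model you use to impute it):

```r
# Sketch: scale as fixed effects (no intercept, one dummy per scale level),
# with robust variance estimation clustered on study.
library(metafor)
library(clubSandwich)

m2 <- rma.mv(yi, V, mods = ~ 0 + factor(scale),
             random = ~ 1 | study/es, data = data)

# CR2 with Satterthwaite degrees of freedom, clustering on study
coef_test(m2, vcov = "CR2", cluster = data$study)
```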

James
On Sun, Oct 3, 2021 at 8:41 AM Farzad Keyhan <f.keyhaniha at gmail.com> wrote:

#
Hi James,

Interesting! I envisioned my data structure to be as shown below. Also,
there is a bit of model fit advantage for:

random = list(~1|study / es, ~1|scale)  over  random = ~1| scale/study/es

Which would perhaps mean that the true effect sizes in each row are nested in
both studies and scales, and can share random effects within and across both
studies and scales whenever their 'study' or 'scale' ID indicators match.

Doesn't this seem more like a crossed random-effects structure?

Fred

[image: image.png]
On Sun, Oct 3, 2021 at 11:44 AM James Pustejovsky <jepusto at gmail.com> wrote:

#
Fred,

Your diagram does not make sense. If you are treating scale as an ID
variable, then each unique value of scale should only be represented one
time in the bottom row of the diagram. If you consolidate the scale IDs,
then it looks to me that you'll end up with a hierarchical structure where
studies are nested within scales. This structure would also be consistent
with how you've described your data ("Please note that 'scale' doesn't vary
within each 'study'.").

Given that, I'm not sure why you would get a difference in fit between random
= list(~1|study / es, ~1|scale) versus random = ~1| scale/study/es.
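If it would help to pin that down, you could put the two fits side by side (a sketch, using the same data and V as in your m1 call):

```r
# Sketch: fit both random-effects specifications and compare fit criteria
library(metafor)

mA <- rma.mv(yi, V, random = list(~ 1 | study/es, ~ 1 | scale), data = data)
mB <- rma.mv(yi, V, random = ~ 1 | scale/study/es, data = data)

# Log-likelihood, deviance, AIC, BIC, AICc for the two models side by side
fitstats(mA, mB)
```

If the structures really are equivalent for your data, the log-likelihoods should match (up to optimizer tolerance), which would point to a data-coding issue rather than a genuine difference in structure.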

James
On Sun, Oct 3, 2021 at 12:20 PM Farzad Keyhan <f.keyhaniha at gmail.com> wrote:

#
I see. I'm still exploring what has caused the two models in my
previous email to give slightly different fits. Still curious, though: for
'scale' and 'study' to have been crossed random effects, 'scale' would
have had to vary within each study?

Fred
On Sun, Oct 3, 2021 at 12:27 PM James Pustejovsky <jepusto at gmail.com> wrote:

#
On Sun, Oct 3, 2021 at 1:18 PM Farzad Keyhan <f.keyhaniha at gmail.com> wrote:

Yes.
#
Dear James,

I explored the issue; there was a re-coding bug. One thing that I wanted to
clarify is that, in addition to the 'scale > study' nesting relationship,
the same 'scale' was used to measure different 'outcomes', and different
'scales' can be used to measure the same 'outcome' across the studies.

Do you see any potential for crossed random-effects here? (data attached
for clarity)

Fred

dat <- read.csv("https://raw.githubusercontent.com/ilzl/i/master/j.csv")

  study scale       yi        vi es group outcome time
1    A1    p1 1.680746 0.2081713  1     1       3    0
2    A1    p1 4.122057 0.4806029  2     2       3    0
3    A1    p1 2.600443 0.2838905  3     1       3    1
4    A1    p1 3.457194 0.3836960  4     2       3    1
5    A1    p1 1.546293 0.1998273  5     1       3    2
6    A1    p1 3.071523 0.3352741  6     2       3    2
On Sun, Oct 3, 2021 at 6:59 PM James Pustejovsky <jepusto at gmail.com> wrote:

2 days later
#
Dear James,

One quick question (recall I have 'scales' subsuming 'studies' subsuming
'true effects'): in this case, to set up a V matrix, should I use 'study'
or 'scale' to define the 'cluster' argument in
'impute_covariance_matrix()'?

Thanks,
Fred
On Sun, Oct 3, 2021 at 9:25 PM Farzad Keyhan <f.keyhaniha at gmail.com> wrote:

#
Hi Fred,
The cluster argument in impute_covariance_matrix describes sets of effect
sizes that you expect to have correlated sampling errors, which arise if
multiple effect sizes are estimated from a common sample (or from partially
overlapping samples). So in your case, use cluster = study.
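For instance, with the data you posted (r = 0.6 here is just a working assumption, to be varied in sensitivity analyses):

```r
library(metafor)
library(clubSandwich)

dat <- read.csv("https://raw.githubusercontent.com/ilzl/i/master/j.csv")

# Block-diagonal V: effect sizes from the same study are assumed to have
# sampling errors correlated at r = 0.6 (a working value, not an estimate)
V <- with(dat, impute_covariance_matrix(vi, cluster = study, r = 0.6))
```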
James
On Wed, Oct 6, 2021 at 9:03 PM Farzad Keyhan <f.keyhaniha at gmail.com> wrote:

#
Many thanks. You mean the cluster that "directly and immediately" contains
the true (and consequently overlapping observed) effects, not the ones higher
up in the hierarchy; that is the logic, correct?

Fred
On Wed, Oct 6, 2021 at 9:21 PM James Pustejovsky <jepusto at gmail.com> wrote:

#
I don't know what "directly and immediately" means. I mean clusters where
the sampling errors (or errors of estimation), defined as the difference
between the effect size estimate and its target parameter, are correlated.

James
On Wed, Oct 6, 2021 at 9:26 PM Farzad Keyhan <f.keyhaniha at gmail.com> wrote:

#
Sure, I think I meant the same thing: the cluster that contains
individual effects, not higher clusters that contain aggregated effects.

Thanks very much,
Fred
On Wed, Oct 6, 2021 at 9:30 PM James Pustejovsky <jepusto at gmail.com> wrote:

#
Just to distinguish V elements from correlated random-effects elements:
when building a multivariate model for, say, a variable C_ijk (recall I have
'scales' subsuming 'studies' subsuming 'true effects'), we can take
Cov(u_{bjk}, u_{cjk}) to be our correlated random effects at the 'study'
level, but could also take Cov(u_{bk}, u_{ck}) as our correlated random
effect at the 'scale' level, correct?
On Wed, Oct 6, 2021 at 10:04 PM Farzad Keyhan <f.keyhaniha at gmail.com> wrote: