[R-meta] nesting an inner | outer formula

6 messages · Wolfgang Viechtbauer, Ross Neville

#
Hi Wolfgang

I hope this email finds you well

I was wondering if you could tell me whether the following list of random
effects can be updated to make the inner | outer formula conditional.

random = list (~ 1 | StudyID, ~ Informant | TreatmentGroup)

I would like for the different levels of Informant to be correlated within
TreatmentGroup within StudyID, and I would like the different levels of
TreatmentGroup to be correlated within StudyID too.

In SAS Proc Mixed, I've managed to run this model and I want to replicate
it in metafor rma.mv.

random StudyID;
random Informant/subject=StudyID, group=TreatmentGroup type=un;
parms  1  1 1 1  1 1 1  1 1 1  1 1 1  1/hold=14;

Any help you can provide to let me know if this is possible in rma.mv would
be much appreciated.

Regards
Ross
#
Dear Ross,

my proc mixed knowledge is a bit rusty, but unless I am confused, your proc mixed statement specifies a random intercept for StudyID and an UN structure for Informant within StudyID allowing for different variances/covariances for the different levels of TreatmentGroup.
I don't think this quite matches up with your description:
For example, there is nothing in your proc mixed statement that allows for "TreatmentGroup to be correlated within StudyID". Also, the second random statement allows for Informant to be correlated within StudyID, but *not* "within TreatmentGroup within StudyID".

So before I attempt to recreate the same structure, it would need to be clear exactly what kind of structure you really want.

Best,
Wolfgang
#
Dear Wolfgang

Thanks for the speedy response, and for seeking clarification and
correcting my error.

Rather than try to correct my interpretation of the SAS code, perhaps it
would make more sense to tell you what structure I really want.

The data structure is such that I have studies (*StudyID*) reporting
post-intervention means for children in an experimental and control group (
*TreatmentGroup*). The variable *Informant* tells us who is reporting on
behalf of the child. Some studies have child report only (so two rows for
such a StudyID corresponding to the post-intervention means for the
experimental and control group). Studies with parent report or teacher
report only are the same (two rows). There are also studies where there is
data from child and parent, child and teacher, teacher and parent, or even
child parent and teacher. So, essentially, a StudyID could have two rows,
four rows, or six rows, depending on how many Informants there are.
Children are in the experimental or control group only, so one would expect
Informants to be nested and correlated within TreatmentGroup within studies.

Because of the data structure (multiple rows of sample means rather than
fewer rows of pairwise comparisons) the degree to which control and
intervention group means are more or less similar in a given StudyID is
captured in part (maybe even in large part) by ~ 1 | StudyID.

What is missing from random = list (~ 1 | StudyID, ~ Informant |
TreatmentGroup) is the within-study nesting. As written, the inner | outer
term says, for example, that parents in the experimental or control group
share correlated random effects across studies, when in fact one would
expect parents, children, and teachers in the experimental or control group
to share correlated random effects within a given study.

Perhaps given the data structure, you would advise something else. Or
perhaps I am still being unclear in my description and understanding.

Regards
Ross


On Fri, 14 Feb 2025 at 13:00, Viechtbauer, Wolfgang (NP) <
wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:

3 days later
#
Dear Ross,

Thanks for the clarification. Based on this, I would consider the following structure:

random = ~ 1 | StudyID / TreatmentGroup / Informant

This captures overall differences in the outcomes (across the experimental and control groups and across all informants) between studies. It also allows for heterogeneity in how much experimental and control groups differ from each other across studies (I assume you will add something like mods = ~ TreatmentGroup to the model, since presumably you are interested in the size of the (average) difference between the two groups). And it allows for heterogeneity in outcomes that arises due to some studies using multiple informants. By nesting Informant within TreatmentGroup, this model automatically implies a certain degree of correlation in the true outcomes for different informants within experimental/control groups. This last random effect is also the 'outcome level' random effect, since based on your description, every combination of StudyID, TreatmentGroup, and Informant should yield a unique value for each row.
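
To make the implied correlation concrete, here is a minimal base R sketch of the marginal covariance that three nested random intercepts induce among the true outcomes of a single study. The variance components in sigma2 are made-up values chosen only for illustration (they are not estimates from any data), and the layout assumes a study with parent and teacher reports in both groups.

```r
# Hypothetical layout: one study with two groups and two informants each
dat <- data.frame(
  StudyID        = c(2, 2, 2, 2),
  TreatmentGroup = c("Exp", "Exp", "Ctrl", "Ctrl"),
  Informant      = c("Parent", "Teacher", "Parent", "Teacher")
)

# Assumed variance components for the three nested levels (illustrative only)
sigma2 <- c(study = 0.20, group = 0.05, informant = 0.10)

# Indicator matrices: which pairs of rows share a study, a study-group, a row
same_study <- outer(dat$StudyID, dat$StudyID, "==")
grp_id     <- paste(dat$StudyID, dat$TreatmentGroup, sep = ".")
same_group <- outer(grp_id, grp_id, "==")
same_row   <- diag(nrow(dat)) == 1

# Marginal covariance of the true outcomes implied by the nested intercepts
G <- sigma2[["study"]]     * same_study +
     sigma2[["group"]]     * same_group +
     sigma2[["informant"]] * same_row

labels <- paste(dat$TreatmentGroup, dat$Informant, sep = ".")
dimnames(G) <- list(labels, labels)

# Implied correlations among the true outcomes
R_implied <- cov2cor(G)
print(round(R_implied, 3))
```

With these assumed values, two informants in the same group correlate at 0.25/0.35 and two rows from different groups at 0.20/0.35, so rows sharing a TreatmentGroup are more alike than rows that only share a StudyID.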

Strictly speaking, this structure does not capture the correlation in the sampling errors that arises because multiple informants are reporting on the *same children*. If reports, say from parents and teachers, are correlated, then this implies that the sampling errors of the means are also correlated. Such a correlation (or more precisely, the covariance) should go into the V matrix. However, to compute this covariance, you would need to know what the correlation (r) between the parent and teacher reports is. This is probably not reported, but maybe can be guesstimated from other or your own studies. Say the data for the first two studies looks like this:

Study  Group  Informant   Outcome
1      Exp    Child       .
1      Ctrl   Child       .
2      Exp    Parent      .
2      Exp    Teacher     .
2      Ctrl   Parent      .
2      Ctrl   Teacher     .

Then the corresponding V matrix would be (use a fixed width font to view this so that things are lined up properly):

[s_1E^2/n_1E                                                                            ]
[            s_1C^2/n_1C                                                                ]
[                        s_2EP^2/n_2E r*s_2EP*s_2ET/n_2E                                ]
[                                     s_2ET^2/n_2E                                      ]
[                                                        s_2CP^2/n_2C r*s_2CP*s_2CT/n_2C]
[                                                                     s_2CT^2/n_2C      ]

where s stands for standard deviation, n for sample size, and the subscripts are for study, group, and informant in that order (single letter abbreviations). Elements left blank are equal to 0 (zero covariance). I did not put a subscript on r since this is probably a single guesstimated value across all studies (and informant pairs). Such a V matrix can be easily generated using the vcalc() function.
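
For readers who want to see the entries of the matrix above computed explicitly, the following base R sketch builds it by hand for the six example rows. The standard deviations, sample sizes, and the value r = 0.5 are invented for illustration. This is not the vcalc() route mentioned above, only a way of checking what the entries should be.

```r
# The six example rows, with hypothetical SDs and sample sizes
dat <- data.frame(
  Study     = c(1, 1, 2, 2, 2, 2),
  Group     = c("Exp", "Ctrl", "Exp", "Exp", "Ctrl", "Ctrl"),
  Informant = c("Child", "Child", "Parent", "Teacher", "Parent", "Teacher"),
  sd        = c(3.0, 3.2, 2.8, 3.1, 2.9, 3.3),
  n         = c(40, 42, 35, 35, 33, 33)   # same n for rows on the same children
)

r <- 0.5  # assumed correlation between informant reports (a guess, not data)

# Sampling variance of each mean: s^2 / n
vi <- dat$sd^2 / dat$n

# Rows covary only when they are reports on the same children,
# i.e. same study and same group
children_id   <- paste(dat$Study, dat$Group, sep = ".")
same_children <- outer(children_id, children_id, "==")

# Covariance r * s_i * s_j / n equals r * sqrt(v_i * v_j) when n is shared
V <- r * sqrt(outer(vi, vi)) * same_children
diag(V) <- vi

print(round(V, 4))
```

Setting r to 0 in this sketch gives a diagonal V, which is what leaving V as a plain vector of sampling variances amounts to.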

Since V is probably just going to be an approximation (especially if you decide not to bother creating the V matrix, which in essence implies assuming r=0), I would then consider using cluster-robust inference methods (robust(model, cluster=StudyID, clubSandwich=TRUE)) at least as a sensitivity check.

I think the above is a good and sensible starting point. A more complex structure might be:

dat$StudyID.TreatmentGroup <- paste0(dat$StudyID, ".", dat$TreatmentGroup)
random = list(~ TreatmentGroup | StudyID, ~ Informant | StudyID.TreatmentGroup), struct="UN"

This would allow for different variances for experimental and control groups and it would allow for different variances for the different types of informants and allow the correlation in the random effects to differ depending on the type of informant pair (child-parent, child-teacher, parent-teacher, etc.). But I would only attempt to fit this model if there is plenty of data. One could use a LRT to compare the two model structures. There is also a structure with intermediate complexity by using struct=c("UN","HCS"), where we assume different variances for the different informants, but a single correlation irrespective of the pair.
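
As a rough illustration of the difference between the "UN" and "HCS" options for the informant term, the base R sketch below writes out both covariance matrices for three informants. The standard deviations and correlations are invented values, used only to show the shape of each structure and how many parameters each one needs.

```r
informants <- c("Child", "Parent", "Teacher")
sd_inf     <- c(0.30, 0.40, 0.50)   # hypothetical random-effect SDs
D          <- diag(sd_inf)

# HCS: a different variance per informant, one correlation for every pair
phi   <- 0.6                        # assumed common correlation
R_hcs <- matrix(phi, 3, 3)
diag(R_hcs) <- 1
G_hcs <- D %*% R_hcs %*% D

# UN: a different variance per informant and a different correlation per pair
R_un <- diag(3)
R_un[1, 2] <- R_un[2, 1] <- 0.7     # child-parent   (assumed)
R_un[1, 3] <- R_un[3, 1] <- 0.5     # child-teacher  (assumed)
R_un[2, 3] <- R_un[3, 2] <- 0.6     # parent-teacher (assumed)
G_un <- D %*% R_un %*% D

dimnames(G_hcs) <- dimnames(G_un) <- list(informants, informants)

# Number of free parameters in each structure for three informants
n_par_hcs <- length(informants) + 1
n_par_un  <- length(informants) + choose(length(informants), 2)

print(round(G_hcs, 3))
print(round(G_un, 3))
print(c(HCS = n_par_hcs, UN = n_par_un))
```

For three informants the unstructured form needs two more parameters than the intermediate one, which is why it calls for more data.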

One might also consider TreatmentGroup and Informant to be crossed random effects (within studies), but I think this is overcomplicating things.

Best,
Wolfgang
#
Thanks Wolfgang
I will take some time to digest the contents. I appreciate the detail and
guidance.
Quickly, the suggested model provides a sigma^2.2 variance of 0, as shown
below.
This makes me feel like the random effects random = ~ 1 | StudyID /
TreatmentGroup / Informant are too complex for the available data.
Thoughts on that?
[image: Screenshot 2025-02-18 at 12.38.17.png]

On Tue, 18 Feb 2025 at 12:34, Viechtbauer, Wolfgang (NP) <
wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:

#
Given the 'nlvls' values shown, you appear to have enough data to get fairly accurate estimates (you could check confint(model) to see how tight the CIs are). So I don't think the model is too complex. It could very well be that there isn't much heterogeneity in the Condition effect.

You could also check this by computing the mean difference within studies for Exp versus Control (if there are multiple informants, do this for every type). Strictly speaking again, these mean differences are not independent (since mean_trt_parent - mean_ctrl_parent and mean_trt_teacher - mean_ctrl_teacher involve reports on the same children by parents and teachers), but if you ignored V in the model results you have shown, then we can do the same here. You should find that those mean differences are fairly consistent (at least not more variable than would be expected based on their sampling variances).
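
One possible way to run this check in base R is sketched below. The data frame, its column names (mean, sd, n), and all of the numbers are hypothetical. The mean differences are treated as independent, as described above, and a simple Q statistic is used here as one way of judging whether they vary more than their sampling variances would suggest.

```r
# Hypothetical arm-level data: one row per study, group, and informant
dat <- data.frame(
  StudyID        = c(1, 1, 2, 2, 2, 2, 3, 3),
  TreatmentGroup = c("Exp", "Ctrl", "Exp", "Exp", "Ctrl", "Ctrl", "Exp", "Ctrl"),
  Informant      = c("Child", "Child", "Parent", "Teacher",
                     "Parent", "Teacher", "Parent", "Parent"),
  mean           = c(10.2, 12.1, 9.8, 10.5, 11.9, 12.0, 10.0, 12.3),
  sd             = c(3.1, 3.0, 2.8, 3.2, 2.9, 3.3, 3.0, 3.1),
  n              = c(40, 42, 35, 35, 33, 33, 50, 48)
)
dat$vi <- dat$sd^2 / dat$n   # sampling variance of each mean

# Pair each experimental row with the control row of the same study/informant
dat_exp  <- dat[dat$TreatmentGroup == "Exp", ]
dat_ctrl <- dat[dat$TreatmentGroup == "Ctrl", ]
md <- merge(dat_exp, dat_ctrl, by = c("StudyID", "Informant"),
            suffixes = c(".exp", ".ctrl"))

# Mean difference and its sampling variance (the two arms are independent)
md$diff <- md$mean.exp - md$mean.ctrl
md$v    <- md$vi.exp + md$vi.ctrl

# Q statistic: weighted squared deviations from the pooled difference
w      <- 1 / md$v
pooled <- sum(w * md$diff) / sum(w)
Q_stat <- sum(w * (md$diff - pooled)^2)
p_val  <- pchisq(Q_stat, df = nrow(md) - 1, lower.tail = FALSE)

print(md[, c("StudyID", "Informant", "diff", "v")])
print(c(pooled = pooled, Q = Q_stat, p = p_val))
```

A Q value close to its degrees of freedom would be in line with the differences being fairly consistent across studies.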

Best,
Wolfgang