
[R-meta] Coding Longitudinal Studies

Dear Danielle,

To be clear, your current model has the following random-effect
specification (notice I have numbered each term):

random = list(
    ~ 1        | StudyNUM,                                          # term 1
    ~ Time_NUM | interaction(StudyNUM, Cohort),                     # term 2
    ~ 1        | interaction(StudyNUM, Cohort, ExTreat),            # term 3
    ~ 1        | interaction(StudyNUM, Cohort, ExTreat, Time_NUM))  # term 4

Now let's get to your questions.

random = list(~ 1|StudyNUM), # variation in effects between studies,
each study has their own intercept?

Yes, each study can have its own average (study-level average effect),
and thus we expect some variation among them at this level.

~ Time_NUM | interaction(StudyNUM, Cohort), # variation in effects due
to cohorts (i.e study design: cross-over or independent cohort) within
each study (within_study heterogeneity)?

I'm not following you here. My understanding was that "cohort" refers
to independent samples of participants studied separately in each
study. But it seems that by cohort you mean two entirely different
study designs employed by each study. Could you please clarify?

That aside, this term allows the true effect sizes at different time
points to be correlated with each other within each cohort of a study.
You can specify a structure other than the default one (struct = "CS")
for this correlation (perhaps anything from "HAR" to "AR" to "UN" or
even "HCS", depending on the fit).
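
As a rough illustration of "depending on the fit", here is a minimal
sketch that refits the same two random terms under several struct
values and compares them by AIC. The data are simulated and purely
hypothetical (yi, vi, and all the numbers are my assumptions, not your
data), and I have left "UN" out only to keep the toy example quick:

```r
library(metafor)

# Hypothetical toy data: 12 studies x 2 cohorts x 3 time points.
set.seed(1)
dat <- expand.grid(Time_NUM = 1:3, Cohort = 1:2, StudyNUM = 1:12)
dat$vi <- runif(nrow(dat), 0.02, 0.08)

# Simulated study-level and cohort-level deviations (assumed values).
cohort_id <- as.integer(interaction(dat$StudyNUM, dat$Cohort, drop = TRUE))
u_study   <- rnorm(12, 0, 0.3)
u_cohort  <- rnorm(max(cohort_id), 0, 0.2)
dat$yi <- 0.4 + u_study[dat$StudyNUM] + u_cohort[cohort_id] +
          rnorm(nrow(dat), 0, sqrt(dat$vi + 0.04))

# Same random terms each time; only the correlation structure changes.
fit_struct <- function(s) {
  rma.mv(yi, vi,
         random = list(~ 1 | StudyNUM,
                       ~ Time_NUM | interaction(StudyNUM, Cohort)),
         struct = s, data = dat)
}
fits <- lapply(c(CS = "CS", HCS = "HCS", AR = "AR"), fit_struct)

# Lower AIC suggests a better trade-off between fit and complexity.
aics <- sapply(fits, AIC)
aics
```

Since the fixed effects are identical across these fits, comparing
their REML-based AICs is a reasonable first check.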

~ 1 || interaction(StudyNUM, Cohort, ExTreat)), # variation in effect
is due to treatments within cohorts of individual studies? What does
the "||" indicate? Does this mean that they are not the same, i.e.,
different treatments?

First, `||` is currently reserved for cases where you set
`struct="GEN"` (and perhaps a few other undocumented structures). It is
meant to act like struct = "DIAG" for a categorical variable that
appears before `||` (i.e., no correlation between the levels of "some
variable"), or to remove the correlation between slopes and intercepts
when a continuous variable appears before `||`.

Second, you have not specified that "some variable" before `||`, so it
is essentially not relevant to your model and is completely ignored.

This entire term assumes that true effect sizes aggregated at the
"ExTreat" level within each cohort can vary around their respective
cohort-level aggregates. Thus, it captures variation within a given
cohort in your data.

~ 1 | interaction(StudyNUM, Cohort, ExTreat, Time_NUM)), # variation in
effect due to timing of their measurements within different
treatments within cohorts within studies (repeated measures)?
 struct = c("UN","UN") )

First, struct = c("UN","UN") is not relevant to this random term,
because there are no variables before the two instances of `|`, and
thus correlations among the levels of two "nothing" variables are
completely ignored.

This entire term assumes that individual true effect sizes (that is,
the true counterparts of the observed effect sizes in each row) can
vary around their specific ExTreat-level aggregate within each
ExTreat. Thus, it captures variation within a given ExTreat in your
data.

Now, why should there be variation within a given ExTreat in your data?
Because there might be studies that have subjected the participants in
each ExTreat to multiple measurements, perhaps over time (repeated
measurements) or on different outcomes (math, reading, etc.).
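
To see how much information this term has to work with, you can count
how many rows fall into each ExTreat group. A minimal sketch with a
made-up data frame (the values are hypothetical; only the column names
follow your model):

```r
# Hypothetical toy data using the column names from the model.
dat <- data.frame(
  StudyNUM = c(1, 1, 1, 1, 2, 2),
  Cohort   = c(1, 1, 2, 2, 1, 1),
  ExTreat  = c("A", "A", "A", "B", "A", "A"),
  Time_NUM = c(1, 2, 1, 1, 1, 2)
)

# One ID per ExTreat within cohort within study (unused combinations dropped).
dat$ExID <- with(dat, interaction(StudyNUM, Cohort, ExTreat, drop = TRUE))

# Rows per ExTreat group; only groups with more than one row can
# inform the within-ExTreat variance component.
rows_per_ExTreat <- table(dat$ExID)
n_repeated <- sum(rows_per_ExTreat > 1)

rows_per_ExTreat
n_repeated
```

In this toy data only two of the four ExTreat groups contain more than
one row, so only those two contribute to estimating variation within
ExTreat.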

2) when coding the dataframe should I give each individual row (effect
size) an ID? As the timepoint IDs don't represent the same thing (e.g.,
time == 1 in study 1 isn't the same thing as time == 1 in study 2). Or
as I have nested the timepoints within the study this should be ok?

Adding an ID is generally a helpful practice (e.g., for removing
outliers). But for your data, the row ID is already given by:

 rowID  =  interaction(StudyNUM, Cohort, ExTreat, Time_NUM)
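
If you want to verify that this interaction really identifies each row
uniquely, a quick check along these lines may help. This is only a
sketch on a made-up data frame (hypothetical values; the column names
are taken from your model):

```r
# Hypothetical toy data using the column names from the model.
dat <- data.frame(
  StudyNUM = c(1, 1, 1, 2, 2),
  Cohort   = c(1, 1, 2, 1, 1),
  ExTreat  = c("A", "A", "A", "A", "B"),
  Time_NUM = c(1, 2, 1, 1, 1)
)

# Build the row ID from the nesting variables, dropping unused combinations.
dat$rowID <- with(dat,
                  interaction(StudyNUM, Cohort, ExTreat, Time_NUM, drop = TRUE))

# anyDuplicated() returns 0 when every value occurs exactly once.
is_unique <- !anyDuplicated(dat$rowID)
is_unique
```

If is_unique came out FALSE in your real data (e.g., several outcomes
measured at the same time point), a plain running ID such as
seq_len(nrow(dat)) would be the safer choice for a row-level term.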

In the case of time, it is fine if time == 1 in study 1 isn't the same
thing as time == 1 in study 2. However, it DOES matter whether the
"amount of time" that has passed up to, say, time 1 in study 1 vs. that
in study 2 is the same or not.

IFF you add time as a categorical variable (otherwise you run into
multicollinearity), you can add a control variable to account for the
amount of time elapsed. If you can't find that information in the
majority of studies, then at least control for the total length of each
study (in weeks, months, etc.).

3) If I get 0 for the variance components (i.e., sigma^2.3) would this
indicate that I do not need to include these in the model as it is not
explaining any variability?

Your last random-effect term has turned out to be an overfit, as it has
returned 0 variability within a given ExTreat in your data (either you
don't have many studies whose ExTreat contains repeated measurements,
OR, if you do, there is not much variation among them to demand an
additional level). As such, you can remove ~1 | interaction(StudyNUM,
Cohort, ExTreat, Time_NUM) from your random-effect specification.
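
If you would like a more formal check before dropping the term, one
option is to fit the model with and without it and compare the two
fits. Below is a minimal sketch on simulated, purely hypothetical data
(yi, vi, ExID, and rowID are names I made up, and I have omitted the
cohort-level term to keep the toy example short):

```r
library(metafor)

set.seed(42)
# Hypothetical toy data: 15 studies x 2 cohorts x 2 treatments x 2 time points.
dat <- expand.grid(Time_NUM = 1:2, ExTreat = c("A", "B"),
                   Cohort = 1:2, StudyNUM = 1:15)
dat$ExID  <- with(dat, interaction(StudyNUM, Cohort, ExTreat, drop = TRUE))
dat$rowID <- with(dat, interaction(StudyNUM, Cohort, ExTreat, Time_NUM,
                                   drop = TRUE))
dat$vi <- runif(nrow(dat), 0.02, 0.08)

# Simulate heterogeneity at the study and ExTreat levels only,
# i.e., no true variation at the row level.
u_study <- rnorm(15, 0, 0.3)
u_ex    <- rnorm(nlevels(dat$ExID), 0, 0.2)
dat$yi  <- 0.4 + u_study[dat$StudyNUM] + u_ex[as.integer(dat$ExID)] +
           rnorm(nrow(dat), 0, sqrt(dat$vi))

# Full model includes the row-level term; reduced model drops it.
full    <- rma.mv(yi, vi, data = dat,
                  random = list(~ 1 | StudyNUM, ~ 1 | ExID, ~ 1 | rowID))
reduced <- rma.mv(yi, vi, data = dat,
                  random = list(~ 1 | StudyNUM, ~ 1 | ExID))

full$sigma2                  # estimated variance components of the full model
lrt <- anova(full, reduced)  # likelihood ratio test of the dropped term
lrt
```

Keep in mind that a variance component is tested at the boundary of its
parameter space here, so the p-value from such a comparison tends to be
conservative.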

Kind regards,
Reza


On Thu, Oct 7, 2021 at 12:45 AM Danielle Hiam
<danielle.hiam at deakin.edu.au> wrote: