Cluster-robust SEs & random effects -- seeking some clarification
One further post perhaps framing my question slightly differently (or altogether more generally): What, specifically, do cluster-robust/robust SEs allow one to do with more accuracy/precision *if* they are already using both random effects and slopes to model relevant cluster-specific effects. Is it the case that there may be any number of sources that could potentially account for sources of heteroskedasticity (i.e., autoregressive structure in the case of repeated measurements/time variables) that using the cluster robust SEs would be of value for in making more precise inference assuming some misspecification of the random effects structure of the model? Relatedly, is there a 'seminal' or 'key' paper that provides a deep dive on the concept of heteroskedasticity? I have a few on hand, but wanted to see if there was something I might not be aware of .
On Wed, Aug 3, 2022 at 11:46 PM J.D. Haltigan <jhaltiga at gmail.com> wrote:
Thought I would bump my last post to see if anyone might weigh in on my more general statements about re: unions (clusters). Would be most appreciated as I continue to wrap my head around some things. Thanks in advance for any thoughts. I appreciate your time. On Sat, Jul 30, 2022 at 8:43 PM J.D. Haltigan <jhaltiga at gmail.com> wrote:
And to take one more step: the inference is that the single population-level treatment effect is drawing from a 'random sample' of unions (clusters). That is, one can not assume that union's are the same. They are not interchangeable. They are drawn from a random population. This is the point of the exercise for me. If we remove the random effect for union in the model I shared, we end up with a model in which only the fixed effects of pairID (treatment-control pairs for each pair of treatment control villages) are estimated (albeit using cluster-robust SEs). So, if by adding that random effect for union the treatment intervention is no longer significant (as opposed to a model in which there is no random effect of union modeled), what is that telling us? That some of the between cluster (union) variance in intercepts is contributing to variation in the response variable, yes? I realize you have not read the paper, nor are necessarily interested in this discourse, but any remarks are greatly appreciated. On Sat, Jul 30, 2022 at 8:36 PM Ben Bolker <bbolker at gmail.com> wrote:
Yes. On 2022-07-30 8:12 p.m., J.D. Haltigan wrote:
Thanks, Ben. So in the model you remarked on, would that be a
'random-intercepts only' model?
On Sat, Jul 30, 2022 at 7:53 PM Ben Bolker <bbolker at gmail.com
<mailto:bbolker at gmail.com>> wrote:
I haven't been following the whole thread that carefully, but I
want to
emphasize that
posXsymp~treatment + pairID + (1 | union)
is *not*, by any definition I'm familiar with, a "random-slopes
model";
that is, it only estimates a single population-level treatment
effect/doesn't allow the effect of treatment to vary across groups
defined by 'union'. You would need a random-effect term of the
form
(treatment | union).
Reasons why you might *not* want to do this:
* if treatment only varies across and not within levels of union
("union is nested within treatment" according to some terminology),
then
this variation is unidentifiable
* maybe you have decided that you don't have enough data/want a
more
parsimonious model.
Schielzeth and Forstmeier, among many others (this is the
example I
know of), have cautioned about the consequences of leaving out
random-slopes terms.
Schielzeth, Holger, and Wolfgang Forstmeier. ?Conclusions beyond
Support: Overconfident Estimates in Mixed Models.? Behavioral
Ecology
20, no. 2 (March 1, 2009): 416?20.
https://doi.org/10.1093/beheco/arn145
<https://doi.org/10.1093/beheco/arn145>.
On 2022-07-30 7:43 p.m., J.D. Haltigan wrote:
> Addendum:
>
> It just occurred to me on my walk that I think I am getting a
bit
lost in
> some of the differences in nomenclature across scientific silos.
In the
> original model that they specified, which treated the 'pairID'
variable as
> a control variable for which they controlled for 'fixed
effects' of
> control/treatment villages (in their own language in the paper)
using
> cluster-robust SEs, I think this is indeed a 'random-intercepts
only' model
> in the language of Hamaker et al. They implement the 'absorb'
command in
> STATA which I believe aggregates across the pairIDs to generate
an
> 'omnibus' F-test of sorts for the pairID variable (in the ANOVA
> nomenclature). I say this as when I specify the pairID variable
in the lmer
> model I shared (or in a fixest model I conducted to replicate
the
original
> Abalauck results in R), I get the estimates for all the pairs
(i.e., there
> is no way to aggregate across them--though I think formally the
models are
> the same if we are unconcerned about any one pairID
[treatment/control
> village pair].
>
> So, in the lmer model I shared where I specify a specific random
effects
> term for the 'cluster' variable, I think this indeed is allowing
for random
> slopes across the clusters which implies the treatment effect
may
vary
> across the clusters (and we might anticipate it will for various
reasons I
> can elaborate on). More generally: we are generalizing to *any*
universe of
> villages (say in the entire world) where the treatment
intervention (masks)
> may vary across villages. This is the crux of invoking the
random
effects
> model (i.e., random slopes model).
>
> I realize this is a mouthful, but I think the way these terms
(e.g.,
> random/fixed effects models etc.) are used across disciplines
makes things
> a bit confusing.
>
> On Sat, Jul 30, 2022 at 5:25 PM J.D. Haltigan <
jhaltiga at gmail.com
<mailto:jhaltiga at gmail.com>> wrote:
>
>> This is a very helpful walkthrough, James. My responses are
italicized
>> under yours to maintain thread readability. The key is
Generalizability
>> here and (as I also note in my last reply) the idea is to
Generalize to a
>> universe of "any villages or clusters." That is, the target
population we
>> are generalizing to is *any* random population.
>>
>> On Sat, Jul 30, 2022 at 3:01 PM James Pustejovsky
<jepusto at gmail.com <mailto:jepusto at gmail.com>>
>> wrote:
>>
>>> Hi J.D.,
>>> A few comments/reactions inline below.
>>> James
>>>
>>> On Wed, Jul 27, 2022 at 5:37 PM J.D. Haltigan
<jhaltiga at gmail.com <mailto:jhaltiga at gmail.com>> wrote:
>>>
>>>> ...
>>>>
>>> In the original investigation, the authors did not invoke a
random
>>>> effects model (but did use the pairIDs to control for fixed
effects as
>>>> noted and with robust SEs). Thus, in the original
investigation there was
>>>> *no* specification of a random effects model for the
'cluster'
variable. We
>>>> know from some other work there were some biases in village
mapping and
>>>> other possible sources of between-cluster variation that
might be
>>>> anticipated to have influence--at the random intercepts
level--so we are
>>>> looking into how specifying 'cluster' as a random effect
might
change the
>>>> fixed effects estimates for the treatment intervention
effect.
In the
>>>> Hamaker et al. language, it is indeed a 'random intercepts'
only model.
>>>>
>>>
>>> I don't follow how using a random intercepts model improves
the
>>> generalizability warrant here. The random intercepts model is
essentially
>>> just a re-weighted average of the pair-specific effects in the
original
>>> analysis, where the weights are optimally efficient if the
model is
>>> correctly specified. That last clause carries a lot of weight
here--correct
>>> specification means 1) treatment assignment is unrelated to
the
random
>>> effects, 2) the treatment effect is constant across clusters,
3)
>>> distributional assumptions are valid (i.e., homoskedasticity
at
each level
>>> of the model).
>>>
>>> If the effects are heterogeneous, then I would think that
including
>>> random slopes on the treatment indicator would provide a
better
basis for
>>> generalization. But even then, the warrant is still pretty
vague---what is
>>> the hypothetical population of villages from which the
observed
villages
>>> are sampled?
>>>
>>
>> *In the most basic model (without baseline controls) the model
takes the
>> form: myModel = lmer(posXsymp~treatment + pairID + (1 | union),
data =
>> myData). I believe--correct me if I am wrong--that this
reflects a
>> random-intercepts only model, but I may be mistaken. If I am,
and this is
>> allowing for random slopes on the treatment indicator, then I
will need to
>> rethink my statements. *
>>
>>>
>>>
>>>> Given this, however, does it also make sense to include the
cluster
>>>> robust SEs for the fixed effects which would account for
possible
>>>> heterogeneity of treatment effects (i.e., slopes) across
clusters?s
>>>>
>>>> If you're committed to the random intercepts model, then yes
I
think so
>>> because using cluster robust SEs at least acknowledges the
possibility of
>>> heterogeneous treatment effects.
>>>
>>
>> *If the above model does allow for both random intercepts and
slopes, then
>> perhaps the use of cluster robust SEs is redundant in some
sense
since the
>> random slopes would be modeling the heterogeneity in treatment
effects?*
>>
>>>
>>>
>>>
>>>> Bottom line: in their original analyses, clusters are seen as
>>>> interchangeable from a conceptual perspective (rather than
drawn from a
>>>> random universe of observations). When one scales up evidence
to a universe
>>>> of observations that are random (as they would be in the
intended universe
>>>> of inference in the real-world), then we are better
positioned, I think, to
>>>> adjudicate whether the mask intervention effect is
'practically
>>>> significant' (in addition to whether the focal effect remains
marginally
>>>> significant from a frequentist perspective).
>>>>
>>> As noted above, this argument is a bit vague to me. If there's
concern
>>> about generalizability, then my first question would be: what
is the target
>>> population to which you are trying to generalize?
>>>
>>
>> *Essentially, the target population we are trying to generalize
to is a
>> random selection of villages. Any random selection of villages.
In other
>> words, villages should not be seen as interchangeable. We are
interested in
>> whether the effects generalize to any randomly selected
village. *
>>
>>>
>>>
>>
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> R-sig-mixed-models at r-project.org
<mailto:R-sig-mixed-models at r-project.org> mailing list
<https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models> -- Dr. Benjamin Bolker Professor, Mathematics & Statistics and Biology, McMaster
University
Director, School of Computational Science and Engineering
(Acting) Graduate chair, Mathematics & Statistics
_______________________________________________
R-sig-mixed-models at r-project.org
<mailto:R-sig-mixed-models at r-project.org> mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
<https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models>
-- Dr. Benjamin Bolker Professor, Mathematics & Statistics and Biology, McMaster University Director, School of Computational Science and Engineering (Acting) Graduate chair, Mathematics & Statistics