Controlling for self-selection bias / endogeneity in mixed models - R-SIG-mixed-models

Sun, Apr 12, 2020 4:34 PM #

Hi all -

I have a concern regarding self-selection/omitted variable bias. I have a longitudinal/repeated measures model, theorizing about a relationship between treatment/control and effort, represented in nlme syntax as:

EQ 1) log(effort measured in time) ~ treatment*scale(experience), random = ~1|subject

Treatment/control is selected by the subject rather than randomized, which raises endogeneity concerns. My background is applied econ, so as I learn the mixed model domain, I expected to find the mixed model equivalent of instrumental variables, the inverse Mills ratio, etc. Yet there is surprisingly (to me) little material addressing this issue. The best reference material I found is in fact a thread on this mailing list from October 2016 and the papers referenced within, leading to Bell, Fairbrother, and Jones (2019). My first impression is that I should employ a within-between random effects (REWB) model:

EQ 2) log(effort measured in time) ~ treatment*scale(experience) + experience_between + experience_within, random = experience_within + scale(experience) | subject

If I understand correctly, the intuition is that the addition of a group mean explanatory variable "breaks out" the variability that would be associated with an omitted variable / error term. Per Bell et al, "there can be no correlation between level 1 variables included in the model and the level 2 random effects...unchanging and/or unmeasured characteristics of an individual (such as intelligence, ability, etc.) will be controlled out of the estimate of the within effect."
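The within-between split described above can be sketched in R on simulated data. This is only an illustration under stated assumptions: the data are invented, `ability` is a hypothetical unobserved subject trait, and the model is simplified relative to EQ 2 (no interaction, random intercept only).

```r
library(nlme)  # recommended package, ships with standard R installations

set.seed(42)
n_subj <- 60
n_obs  <- 8
id <- rep(seq_len(n_subj), each = n_obs)

# Hypothetical unobserved subject trait that drives both experience and effort
ability <- rnorm(n_subj)
experience <- 2 + 0.8 * ability[id] + rnorm(n_subj * n_obs)
# Treatment is chosen by the subject and stays constant within subject
treatment <- rbinom(n_subj, 1, plogis(ability))[id]
# True within-subject effect of experience is -0.3
log_effort <- 1 - 0.3 * experience - 0.5 * treatment +
  1.0 * ability[id] + rnorm(n_subj * n_obs, sd = 0.5)

d <- data.frame(subject = factor(id), experience, treatment, log_effort)

# Split experience into a subject mean (between) and a deviation (within)
d$experience_between <- ave(d$experience, d$subject)
d$experience_within  <- d$experience - d$experience_between

# Plain random-intercept model: one experience slope mixing both sources
naive <- lme(log_effort ~ treatment + experience,
             random = ~ 1 | subject, data = d)

# REWB model: separate within and between slopes
rewb <- lme(log_effort ~ treatment + experience_within + experience_between,
            random = ~ 1 | subject, data = d)

fixef(naive)
fixef(rewb)
```

In this setup the `experience_within` coefficient is the one protected from the subject-level trait; the `treatment` and `experience_between` coefficients are not, which is the caution quoted from Bell et al in the next paragraphs.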

So, under REWB there is no concern about correlation between the subject effect (level 2) and treatment (level 1). Wonderful!

Bell et al caution, "...in a REWB/Mundlak models, unmeasured level 2 characteristics can cause bias in the estimates of between effects and effects of other level 2 variables."

Not an issue for me: I am not concerned with level 2. I include subject to address the IID violation, but I am interested in population, not subject, performance.

Bell et al continue, "However, unobserved time-varying characteristics can still cause biases at level 1 in either an FE or a REWB/Mundlak model."

Though conceptually my treatment variable is time-varying (it can change across time within a subject), as a practical/empirical matter, the treatment is unchanging within the subject - subjects have no reason to change / would prefer to keep the choice constant. Of 80k records, treatment switches within a subject occur in about a dozen records.

So, I think I have my solution. However, if a reviewer is not happy with the within-between REWB solution (worried about the level 1 bias), I can further defend EQ 2 via its random coefficient/slope, if I understand the Oct 2016 thread correctly.

So, my questions are:

(1) Is the above correctly reasoned?

(2) If the random slope model is a further defense against self-selection bias, could someone provide an intuitive explanation as to why? Is the idea that by allowing slopes to vary, there is no endogeneity problem to solve as the very structure of the model makes the correlated errors concern irrelevant?

Other solutions I explored include a Mundlak model, but per Bell et al, Mundlak models are not meaningful for repeated measures. Also, the brms package appears to support mixed modeling with instrumental variables, something I am more comfortable with given my background, but strong instrumental variables are hard to find in the wild!

Thank you! - Kelly

Poe, John

Sun, Apr 12, 2020 5:36 PM #

Hi Kelly,

It sounds like you've got correct reasoning on the need for a multilevel
model if your variable of interest is time invariant.

Can you post a link to the thread you're referencing?

A bit of clarity on the flavor(s) of endogeneity that concern you might be
helpful. The omitted variable bias issues solved by group mean centering
and the Mundlak device are mostly from model mis/underspecification whereas
sample selection is a fundamentally different mechanism. Both are common
sources of endogeneity recognized as such in different pockets of econ but
they tend to be seen as fundamentally different (often conceptually
unrelated) problems in other fields. Econ subsumes omitted variables, joint
causation, measurement error, and sample selection under the endogeneity
umbrella because they all cause correlation between X and the error but
other fields don't make the same connection. For instance, early panel data
work talked about Mundlak devices as "instruments" in the same way that
dynamic panel data models talk about lags and first differences as
instruments but they aren't traditional instrumental variables that you'd
find in the wild and arguably wouldn't pass the exclusion restriction test
outside of panel data. They call them instruments because they instrument
the endogeneity but they aren't "instrumental variables" in the common
parlance.

It's not clear to me whether you are referring to general omitted variable
bias, where you don't have all the appropriate variables in the model, or
to sample selection bias a la Heckman, where the sample under study is
systematically different from the population to which you would like to
make inferences and thus needs some kind of propensity-to-choose-A-or-B
correction, as in the standard selection model. I ask because you
referenced the inverse Mills ratio, but it *sounds* like you just think you
may be missing some set of confounders due to the lack of randomization. If
you do have sample selection bias, you can use a multilevel variant of a
Heckman selection model with random effects in the outcome and selection
equations. See Grilli, L., & Rampichini, C. (2010). Selection bias in
linear mixed models. *Metron, 68*(3), 309-329 for the best discussion of
the topic that I've read. Most multilevel modeling work on this kind of
problem is based on multilevel propensity score matching, which is a close
cousin of multilevel Heckman selection models, as the inverse Mills ratio
and the propensity score are related.

You're right that the addition of group means per Mundlak separates the
within and between effects into two different sets of betas when they would
otherwise be a weighted average. It's just a reparameterization of the
dummy variable version of fixed effects. It is mathematically impossible in
a linear model for a group mean centered multilevel model to return
different within-group beta coefficients than the standard FE model. That
doesn't mean that both of them aren't wrong because of cross-level
interactions, measurement error, selection bias and whatnot, but they would
both be wrong in identical ways. You can directly test that they are
identical with a version of a Hausman test comparing the within-group betas
with a chi-squared test. The degrees of freedom calculation will be off
from the regular test because the between effects add extra parameters, but
the within effects will be identical to rounding error so it really won't
matter. You can also just do a Mundlak variation on the test. All panel
data econometrics textbooks outline this and you can justify the modeling
strategy that way regardless of reviewer misconceptions.
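The equivalence claimed above can be checked numerically. The following is a minimal sketch on simulated, deliberately unbalanced data; the variable names (`u`, `x_within`, `x_between`) are made up for illustration, and the multilevel model is restricted to a random intercept, which is the case where the two estimates should agree up to numerical precision.

```r
library(nlme)

set.seed(1)
# Unbalanced panel: 40 subjects with 3 to 8 observations each
id <- rep(1:40, times = sample(3:8, 40, replace = TRUE))
u  <- rnorm(40)                        # subject effect, correlated with x
x  <- 0.7 * u[id] + rnorm(length(id))
y  <- 0.5 * x + u[id] + rnorm(length(id))

d <- data.frame(id = factor(id), x, y)
d$x_between <- ave(d$x, d$id)          # subject mean of x
d$x_within  <- d$x - d$x_between       # deviation from subject mean

# Dummy-variable ("least squares dummy variable") fixed effects model
fe <- lm(y ~ x + id, data = d)

# Group mean centered random-intercept model
rewb <- lme(y ~ x_within + x_between, random = ~ 1 | id, data = d)

# The two within slopes, side by side
c(FE = unname(coef(fe)["x"]), REWB = unname(fixef(rewb)["x_within"]))
```

Because the two slopes coincide, a Hausman-style contrast between them has nothing to detect; the informative comparison is between the within and between slopes of the second model.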

If the FE and group mean centered MLM are both wrong and there's some kind
of interactive effect still at work, then a random coefficient will likely
show up as mattering for model fit with something like an LR test. If beta
(X_i-Xbar_j) on Y does not vary as a function of group per an LR test or
something fancier like WAIC, then it is reasonable (but not infallible)
evidence that you don't have group heterogeneity-related omitted variable
bias, which is what economists would typically be concerned about in this
context. You can still have other kinds of bias at work, just like with any
other kind of observational model. The random coefficient in this context
is a regularized interactive fixed effect in econ jargon, whereby you are
interacting the grouping structure with whatever X you want and getting a
distribution of effects. Fundamentally, it's like saying you have some kind
of conditional relationship between group/person and X and just interacting
them. It's slightly complicated by the fact that empirical Bayes shrinkage
exists, but if you have balanced panels then it's mostly a non-issue.
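A likelihood ratio comparison of this kind might look like the sketch below. It assumes simulated data in which the within slope really does vary by subject (the `b1` deviations are invented for the example), and it uses `anova()` on two nlme fits that share the same fixed effects and differ only in the random part.

```r
library(nlme)

set.seed(7)
n_subj <- 80
n_obs  <- 10
id <- rep(seq_len(n_subj), each = n_obs)

b0 <- rnorm(n_subj, sd = 1)     # subject intercept deviations
b1 <- rnorm(n_subj, sd = 0.6)   # subject slope deviations (hypothetical)
x  <- rnorm(n_subj * n_obs)
y  <- 1 + b0[id] + (0.5 + b1[id]) * x + rnorm(n_subj * n_obs)

d <- data.frame(id = factor(id), x, y)
d$x_between <- ave(d$x, d$id)
d$x_within  <- d$x - d$x_between

# Same fixed effects in both fits, so comparing REML likelihoods is meaningful
m_int <- lme(y ~ x_within + x_between, random = ~ 1 | id,
             data = d, method = "REML")
m_slp <- lme(y ~ x_within + x_between, random = ~ x_within | id,
             data = d, method = "REML")

# Likelihood ratio test of the random slope
lrt <- anova(m_int, m_slp)
lrt
```

The p-value reported here tends to be conservative, since the slope variance is tested at the boundary of its parameter space, so a clearly non-significant result is the more cautious reading of "no evidence of slope heterogeneity".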



_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Ben Bolker

Sun, Apr 12, 2020 5:45 PM #

Wow, this is the kind of content I come here for.  (It will take me
a while to digest this ...) Thank you!


Slaughter, Kelly

Sun, Apr 12, 2020 5:56 PM #

Thanks for the extensive reply, John! Before I attempt to absorb it all, let me offer a couple of quick answers to your questions just to be sure the thread does not spiral in multiple directions :)

(1)	The beginning of the thread I reference can be found here: https://hypatia.math.ethz.ch/pipermail/r-sig-mixed-models/2016q4/025147.html

(2)	I am referring to omitted variable bias, sorry for the confusion. My treatment / control is ownership of multiple financial accounts / ownership of single accounts. So perhaps let's say IQ tends to make someone more likely to hold multiple accounts (treatment) AND allows them to expend less effort in researching financial trades (outcome variable), whereas I am theorizing that multiple accounts themselves reduce effort directly.

BTW, Ben, thank you for your extensive support across multiple sites in helping the general public with mixed models in R. I have relied upon an EXTENSIVE number of your answers to mixed model questions when developing my models.


Poe, John

Sun, Apr 12, 2020 6:21 PM #

Ah, okay I see the problem now. This kind of multilevel causal inference
problem is a bit hard for me to conceptualize. I usually think about them
with DAGs.

I *think* you're going to end up trying to model the selection mechanism
itself via something like propensity score weighting unless you can find a
good natural IV. In this context the propensity score is an artificial
instrumental variable (much like randomization is an instrument). You can
find a good explanation of IPW in Hernan and Robins
https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/ which
includes some detail on longitudinal models though that is geared to time
varying treatments. I think you'll just be focusing on building a
propensity score at the time of the choice since it never changes which
simplifies it down to the first cross-section of data. I'm familiar with 15
or 20ish papers on multilevel propensity score modeling so they are easy to
find. One that you might look at is Arpino, B. and Mealli, F., 2011. The
specification of the propensity score in multilevel observational
studies. *Computational Statistics & Data Analysis*, *55*(4),
pp.1770-1780. Arpino has several papers on the topic including a
Statistics in Medicine article that's also pretty good. Causal
identification is going to be based on how good the
propensity score is and there's no real way around that. Once you get the
weighted (or matched if you want to go that route) data you can put it in a
regular multilevel model.
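
As a rough illustration of the weighting step described above, here is a minimal R sketch. It assumes one row per subject at the time of the choice; the data are simulated, and the covariates `age` and `income` are hypothetical stand-ins, not variables from the study under discussion. It fits a logistic propensity model with base R's `glm()` and builds stabilized inverse probability weights.

```r
# Simulated stand-in data: one row per subject at the time of the choice.
set.seed(42)
n_subj <- 500
baseline <- data.frame(
  subject = seq_len(n_subj),
  age     = rnorm(n_subj, mean = 45, sd = 10),  # hypothetical covariate
  income  = rnorm(n_subj, mean = 0, sd = 1)     # hypothetical covariate
)

# Self-selection: the treatment choice depends on the covariates.
p_true <- plogis(-0.5 + 0.03 * (baseline$age - 45) + 0.8 * baseline$income)
baseline$treatment <- rbinom(n_subj, size = 1, prob = p_true)

# Propensity score: logistic regression of the choice on pre-treatment covariates.
ps_fit <- glm(treatment ~ age + income, family = binomial, data = baseline)
baseline$ps <- fitted(ps_fit)

# Stabilized inverse probability weights:
# marginal probability of the observed choice over its estimated propensity.
p_treated <- mean(baseline$treatment)
baseline$ipw <- ifelse(baseline$treatment == 1,
                       p_treated / baseline$ps,
                       (1 - p_treated) / (1 - baseline$ps))

summary(baseline$ipw)
```

The weights would then be merged onto the longitudinal records by `subject` before fitting the outcome model. How weights enter a mixed model depends on the software, so that step is deliberately left out of the sketch.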

It's possible that you could model this with cross-level interactions
between ownership and all the level 1 stuff in the model but that would get
messy. I think the propensity score route is at least more straightforward
to interpret. If you had pre-treatment outcome data of some kind then you
could do something like a synthetic control method but I don't know if
that's feasible with what you've got.

On Sun, Apr 12, 2020 at 8:56 PM Slaughter, Kelly <KELLY.SLAUGHTER at tcu.edu>
wrote:

Thanks for the extensive reply, John! Before I attempt to absorb it all,
let me offer a couple of quick answers to your questions just to be sure
the thread does not spiral in multiple directions :)

(1)     The beginning of the thread I reference can be found here:
https://hypatia.math.ethz.ch/pipermail/r-sig-mixed-models/2016q4/025147.html

(2)     I am referring to omitted variable bias, sorry for the confusion.
My treatment / control is ownership of multiple financial accounts /
ownership of single accounts. So perhaps let's say IQ tends to make someone
more likely to hold multiple accounts (treatment) AND allows them to expend
less effort in researching financial trades (outcome variable), whereas I
am theorizing that multiple accounts themselves reduce effort directly.

BTW, Ben, thank you for your extensive support across multiple sites in
helping the general public with mixed models in R. I have relied upon an
EXTENSIVE number of your answers to mixed model questions when developing
my models.

-----Original Message-----
From: Ben Bolker <bbolker at gmail.com>
Sent: Sunday, April 12, 2020 7:46 PM
To: John Poe <jdpoe223 at gmail.com>
Cc: Slaughter, Kelly <KELLY.SLAUGHTER at tcu.edu>;
r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] Controlling for self-selection bias / endogeneity
in mixed models

  Wow, this is the kind of content I come here for.  (It will take me a
while to digest this ...) Thank you!

On Sun, Apr 12, 2020 at 8:36 PM John Poe <jdpoe223 at gmail.com> wrote:

Hi Kelly,

It sounds like you've got correct reasoning on the need for a
multilevel model if your variable of interest is time invariant.

Can you post a link to the thread you're referencing?

A bit of clarity on the flavor(s) of endogeneity that concern you
might be helpful. The omitted variable bias issues solved by group
mean centering and the Mundlak device are mostly from model
mis/underspecification whereas sample selection is a fundamentally
different mechanism. Both are common sources of endogeneity recognized
as such in different pockets of econ but they tend to be seen as
fundamentally different (often conceptually unrelated) problems in
other fields. Econ subsumes omitted variables,
joint causation, measurement error, and sample selection under the
endogeneity umbrella because they all cause correlation between X and
the error but other fields don't make the same connection. For
instance, early panel data work talked about Mundlak devices as
"instruments" in the same way that dynamic panel data models talk
about lags and first differences as instruments but they aren't
traditional instrumental variables that you'd find in the wild and
arguably wouldn't pass the exclusion restriction test outside of panel
data. They call them instruments because they instrument the
endogeneity but they aren't "instrumental variables" in the common
parlance.

It's not clear to me if you are referring to general omitted variable
bias whereby you don't have all the appropriate variables in the model
or sample selection bias a la Heckman whereby the sample under study
is systematically different from the population to which you would
like to make inferences and thus needs some kind of complex propensity
to choose A or B style correction like with the standard selection
model. I'm not clear specifically because you referenced the inverse
Mills ratio but it *sounds* like you just think you are possibly
missing some set of confounders due to the lack of randomization. If
you do have sample selection bias you can use a multilevel variant of
a Heckman selection model with random effects in the outcome and
selection equations. See Grilli, L., & Rampichini, C. (2010).
Selection bias in linear mixed models. *Metron, 68*(3),
309-329 for the best discussion of the topic that I've read. Most
multilevel modeling work with this kind of problem is based on
multilevel propensity score matching which is a close cousin of
multilevel Heckman selection models as the inverse Mills ratio and the
propensity score are related.

You're right that the addition of group means per Mundlak segregates
the within and between effects into two different sets of betas when
they would otherwise be a weighted average. It's just a
reparameterization of the dummy variable version of fixed effects. It
is mathematically impossible in a linear model for a group mean
centered multilevel model to return different within group beta
coefficients than the standard FE model. That doesn't mean that both
of them aren't wrong because of cross-level interactions, measurement
error, selection bias and what not but they would both be wrong in
identical ways. You can directly test that they are identical with a
version of a Hausman test comparing the within group betas with a
chi-squared test. The degrees of freedom calculation will be off from
the regular test because the between effects add extra but the within
effects will be identical to rounding error so it really won't matter.
You can also just do a Mundlak variation on the test. All panel data
econometrics textbooks outline this and you can justify the modeling
strategy that way regardless of reviewer misconceptions.
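
The equivalence claimed above can be checked numerically. The following is a minimal R sketch on simulated data, with all variable names hypothetical: `u` is an unobserved subject trait correlated with `x`. It compares the coefficient on `x` from a dummy-variable fixed-effects fit (`lm()`) with the within coefficient from a random-intercept model with the within/between split (`nlme::lme()`).

```r
library(nlme)

set.seed(1)
n_subj <- 50
n_obs  <- 8
subject <- rep(seq_len(n_subj), each = n_obs)

u <- rnorm(n_subj)                              # unobserved subject trait
x <- rnorm(n_subj * n_obs) + 0.8 * u[subject]   # x is correlated with the trait
y <- 1 + 0.5 * x + u[subject] + rnorm(n_subj * n_obs)
d <- data.frame(subject = factor(subject), x = x, y = y)

# Within/between decomposition of x (group-mean centering).
d$x_between <- ave(d$x, d$subject)   # subject mean of x
d$x_within  <- d$x - d$x_between     # deviation from the subject mean

# Dummy-variable fixed effects versus REWB with a random intercept.
fe_fit   <- lm(y ~ x + subject, data = d)
rewb_fit <- lme(y ~ x_within + x_between, random = ~ 1 | subject, data = d)

within_fe   <- unname(coef(fe_fit)["x"])
within_rewb <- unname(fixef(rewb_fit)["x_within"])
c(FE = within_fe, REWB = within_rewb)
```

The two within estimates should agree up to numerical precision, while the coefficient on `x_between` absorbs the correlation between `x` and the subject-level trait.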

If the FE or group mean centered MLM are both wrong and there's some
kind of interactive effect still at work then a random coefficient
will likely show up as mattering for model fit with something like an
LR test. If beta (X_i-Xbar_j) on Y does not vary as a function of
group per an LR test
or something fancier like WAIC then it is reasonable (but not
infallible) evidence that you don't have group heterogeneity-related
omitted variable bias which is what economists would typically be
concerned about in this context. You can still have other kinds of
bias at work just like with any other kind of observational model. The
random coefficient in this context is a regularized interactive fixed
effect in econ jargon whereby you are interacting the grouping
structure with whatever X you want and getting a distribution of
effects. Fundamentally, it's like saying you have some kind of
conditional relationship between group/person and X and just
interacting them. It's slightly complicated by the fact that empirical
Bayes shrinkage exists but if you have balanced panels then it's mostly a
non-issue.
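
The LR comparison described above might look like the following in R. This is a sketch on simulated data in which the within slope genuinely varies by subject; the names `x_within` and `x_between` are illustrative, and the models are fitted with `nlme::lme()` using its default REML estimation, so the two fits differ only in their random-effects structure.

```r
library(nlme)

set.seed(2)
n_subj <- 60
n_obs  <- 10
subject <- rep(seq_len(n_subj), each = n_obs)

b0 <- rnorm(n_subj, sd = 1)     # subject-specific intercept deviations
b1 <- rnorm(n_subj, sd = 0.5)   # subject-specific slope deviations

x <- rnorm(n_subj * n_obs)
x_between <- ave(x, subject)
x_within  <- x - x_between
y <- 1 + b0[subject] + (0.5 + b1[subject]) * x_within + rnorm(n_subj * n_obs)
d <- data.frame(subject = factor(subject), x_within, x_between, y)

# Random intercept only versus random intercept plus random within slope.
fit_int   <- lme(y ~ x_within + x_between, random = ~ 1 | subject, data = d)
fit_slope <- lme(y ~ x_within + x_between, random = ~ x_within | subject, data = d)

# Likelihood ratio comparison of the two random-effects structures.
cmp <- anova(fit_int, fit_slope)
print(cmp)
```

Because a variance parameter sits on the boundary of its space under the null, the reported p-value tends to be conservative, so a small p-value is fairly strong evidence that the slope varies across subjects.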




Slaughter, Kelly

Mon, Apr 13, 2020 4:35 AM #

Thanks yet again, John. I actually began my "journey" with propensity score matching, using the MatchIt package. Then the authors of the package came out against propensity score matching (http://gking.harvard.edu/files/gking/files/psnot.pdf) so I turned to Coarsened Exact Matching (CEM). Further evaluation revealed what I think is a sound argument that matching is only effective if you match using the separating / omitted variable (see chrisblattman.com/2010/10/27/the-cardinal-sin-of-matching/ and projects.iq.harvard.edu/sss_blog/can_matching_so). But, if you have the missing variable, you have no need to match! In short, the argument is that while matching provides a benefit over regression wrt regression extrapolation (e.g., the control variable and treatment variables' related outcome values have little overlap), it is not a solution for addressing endogeneity. But I am quite open to returning to matching if I misunderstood the argument.

You wrote in your original reply, "...a random coefficient will likely show up as mattering for model fit with something like an LR test." An anova test of models with/without a random slope did indicate a better fit with the random slope. Per a response of yours in the 2016 thread, "The typical response when this test shows that there is still a violation of the no correlation between a random effect and a level 1 variable assumption is to stop making that assumption and use a random coefficients model." So in my case, random (subject) and level 1 (treatment, or perhaps the missing IQ) - what remains to be solved?

Requoting Bell, "...unchanging and/or unmeasured characteristics of an individual (such as intelligence, ability, etc.) will be controlled out of the estimate of the within effect." This seems to address my main concern - an omitted variable (e.g., IQ) not orthogonal with treatment and correlated with the outcome. Are you not convinced that the "within solution" in fact solves this? Or perhaps it addresses a different problem and I am not thinking about my problem correctly?

Thanks again - I don't want to be lazy and ask you to think through issues I should be thinking through, but discussing this with someone more familiar with the issues and a deeper understanding of the underlying statistics is a huge help!

FYI, for anyone following this thread, there is a helpful implementation of Bell et al in R to be found at https://strengejacke.github.io/mixed-models-snippets/random-effects-within-between-effects-model.html



John Maindonald

Mon, Apr 13, 2020 2:46 PM #

Irrespective of the usefulness of propensity scores as a substitute for incorporation of
the relevant variables into the regression, plots that show the extent to which groups can
be separated on the basis of propensity can provide highly useful insight.  If the groups
can be mostly or largely separated on the basis of the propensity scores, that raises
large issues for the reliance that one can place on either the regression model or the
propensity score model.  Extreme (and less extreme?) outliers on the scores can be
removed.  Of course, propensity scores should themselves be subject to diagnostic
checks; should one or more variables be transformed, or is it impossible to say?  These
issues become a whole lot more difficult in a multi-level model setting.
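
A numerical counterpart to the separation check described above could be sketched as follows in R. The data are simulated with deliberately strong self-selection, and the single covariate `x` is hypothetical. The sketch summarizes the propensity score by group and computes the share of units inside the region where the two groups' scores overlap.

```r
set.seed(7)
n <- 400
d <- data.frame(x = rnorm(n))                    # hypothetical covariate
d$treatment <- rbinom(n, 1, plogis(1.5 * d$x))   # strong self-selection on x
d$ps <- fitted(glm(treatment ~ x, family = binomial, data = d))

# Distribution of the propensity score within each group.
ps_by_group <- tapply(d$ps, d$treatment, quantile,
                      probs = c(0, 0.25, 0.5, 0.75, 1))
print(ps_by_group)

# Region of common support: scores covered by both groups.
lower <- max(tapply(d$ps, d$treatment, min))
upper <- min(tapply(d$ps, d$treatment, max))
in_support <- d$ps >= lower & d$ps <= upper
share_in_support <- mean(in_support)
share_in_support

# For the graphical version: boxplot(ps ~ treatment, data = d)
```

A low share in the common-support region, or quantiles that barely overlap between groups, is the numerical signature of the near-separation that the plots are meant to reveal.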

A serious weakness of Morgan and Winship's "Counterfactuals and Causal Inference"
(2015) is that, although it has lots of DAGs, it is almost completely free of graphs that
might be used for diagnostic purposes.  Multi-level models are, from a quick check, not
discussed.  What advance, if any, is now available on Morgan and Winship?


John Maindonald             email: john.maindonald at anu.edu.au

On 13/04/2020, at 23:35, Slaughter, Kelly <KELLY.SLAUGHTER at tcu.edu<mailto:KELLY.SLAUGHTER at tcu.edu>> wrote:

Thanks yet again, John. I actually began my ?journey? with propensity score matching, using the MatchIt package. Then the authors of the package came out against propensity score matching (http://gking.harvard.edu/files/gking/files/psnot.pdf) so I turned to Coarsened Exact Matching (CEM). Further evaluation revealed what I think is a sound argument that matching is only effective if you match using the separating / omitted variable (see chrisblattman.com/2010/10/27/the-cardinal-sin-of-matching/<http://chrisblattman.com/2010/10/27/the-cardinal-sin-of-matching/> and projects.iq.harvard.edu/sss_blog/can_matching_so<http://projects.iq.harvard.edu/sss_blog/can_matching_so>). But, if you have the missing variable, you have no need to match! In short, the argument is that while matching provides a benefit over regression wrt regression extrapolation (e.g., the control variable and treatment variables? related outcome values have little overlap), it is not a solution for addressing endogeneity. But I am quite open to returning to matching if I misunderstood the argument.

You wrote in your original reply, ??a random coefficient  will likely show up as mattering for model fit with something like an  LR test.? An anova test of models with/without a random slope did indicate a better fit with the random slope. Per a response of yours in the 2016 thread,  ?The typical response when this test shows that there is still a violation of the no correlation between a random effect and a level 1 variable assumption is to stop making that assumption and use a random coefficients model.? So in my case, random (subject) and level 1 (treatment, or perhaps the missing IQ) ? what remains to be solved?

Requoting Bell, "...unchanging and/or unmeasured characteristics of an individual (such as intelligence, ability, etc.) will be controlled out of the estimate of the within effect." This seems to address my main concern - an omitted variable (e.g., IQ) not orthogonal with treatment and correlated with the outcome. Are you not convinced that the "within solution" in fact solves this? Or perhaps it addresses a different problem and I am not thinking about my problem correctly?

Thanks again - I don't want to be lazy and ask you to think through issues I should be thinking through, but discussing this with someone more familiar with the issues and a deeper understanding of the underlying statistics is a huge help!

FYI, for anyone following this thread, there is a helpful implementation of Bell et al in R to be found at https://strengejacke.github.io/mixed-models-snippets/random-effects-within-between-effects-model.html


From: John Poe <jdpoe223 at gmail.com>
Sent: Sunday, April 12, 2020 8:22 PM
To: Slaughter, Kelly <KELLY.SLAUGHTER at tcu.edu>
Cc: Ben Bolker <bbolker at gmail.com>; r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] Controlling for self-selection bias / endogeneity in mixed models

Ah, okay I see the problem now. This kind of multilevel causal inference problem is a bit hard for me to conceptualize. I usually think about them with DAGs.

I *think* you're going to end up trying to model the selection mechanism itself via something like propensity score weighting unless you can find a good natural IV. In this context the propensity score is an artificial instrumental variable (much like randomization is an instrument). You can find a good explanation of IPW in Hernan and Robins https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/ which includes some detail on longitudinal models, though that is geared to time-varying treatments. I think you'll just be focusing on building a propensity score at the time of the choice since it never changes, which simplifies it down to the first cross-section of data. I'm familiar with 15 or 20ish papers on multilevel propensity score modeling so they are easy to find. One that you might look at is Arpino, B. and Mealli, F., 2011. The specification of the propensity score in multilevel observational studies. Computational Statistics & Data Analysis, 55(4), pp.1770-1780. Arpino has several papers on the topic including a Statistics in Medicine article that's also pretty good. Causal identification is going to be based on how good the propensity score is and there's no real way around that. Once you get the weighted (or matched if you want to go that route) data you can put it in a regular multilevel model.
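
As a rough illustration of the weighting step described above, here is a minimal R sketch on simulated data. The covariates (`age`, `income`), the selection model, and all variable names are hypothetical and not taken from the thread. It only shows how a propensity score estimated on the first cross-section (one row per subject) might be turned into stabilized inverse probability weights; how such weights are then passed to a multilevel model depends on the package and is not shown.

```r
set.seed(2)
n_subj <- 500

# Hypothetical subject-level covariates measured at the time of the choice
age    <- rnorm(n_subj, mean = 45, sd = 10)
income <- rnorm(n_subj, mean = 60, sd = 15)

# Simulated self-selection into treatment (e.g. holding multiple accounts)
p_true    <- plogis(-0.5 + 0.03 * (age - 45) + 0.02 * (income - 60))
treatment <- rbinom(n_subj, size = 1, prob = p_true)
subj <- data.frame(subject = seq_len(n_subj), age, income, treatment)

# Propensity score: estimated probability of choosing treatment
ps_model <- glm(treatment ~ age + income, family = binomial, data = subj)
subj$ps  <- fitted(ps_model)

# Stabilized inverse probability weights:
# marginal probability of the observed arm / estimated probability of that arm
p_treat  <- mean(subj$treatment)
subj$ipw <- ifelse(subj$treatment == 1,
                   p_treat / subj$ps,
                   (1 - p_treat) / (1 - subj$ps))

summary(subj$ipw)
```

Very large weights in the summary would point to poor overlap between the groups, which is one practical way to see how much the identification leans on the propensity model.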

It's possible that you could model this with cross-level interactions between ownership and all the level 1 stuff in the model but that would get messy. I think the propensity score route is at least more straightforward to interpret. If you had pre-treatment outcome data of some kind then you could do something like a synthetic control method but I don't know if that's feasible with what you've got.

On Sun, Apr 12, 2020 at 8:56 PM Slaughter, Kelly <KELLY.SLAUGHTER at tcu.edu> wrote:

Thanks for the extensive reply, John! Before I attempt to absorb it all, let me offer a couple of quick answers to your questions just to be sure the thread does not spiral in multiple directions :)

(1)     The beginning of the thread I reference can be found here: https://hypatia.math.ethz.ch/pipermail/r-sig-mixed-models/2016q4/025147.html

(2)     I am referring to omitted variable bias, sorry for the confusion. My treatment / control is ownership of multiple financial accounts / ownership of single accounts. So perhaps let's say IQ tends to make someone more likely to hold multiple accounts (treatment) AND allows them to expend less effort in researching financial trades (outcome variable), whereas I am theorizing that multiple accounts themselves reduce effort directly.
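
To make the concern in (2) concrete, here is a small simulation sketch in R using nlme. Everything in it is an assumption for illustration (the effect sizes, the sample sizes, and the variable names `iq`, `treatment`, `log_effort`); it is not the actual data. An unobserved trait raises the chance of choosing treatment and also lowers effort, so a random-intercept model that omits it attributes part of the trait's effect to treatment.

```r
library(nlme)

set.seed(3)
n_subj <- 400
n_obs  <- 5
true_effect <- -0.3                                # assumed direct effect of treatment

iq        <- rnorm(n_subj)                         # trait, unobserved in practice
treatment <- rbinom(n_subj, 1, plogis(1.5 * iq))   # higher trait -> more likely treated
b0        <- rnorm(n_subj, sd = 0.3)               # subject random intercept

d <- data.frame(
  subject   = factor(rep(seq_len(n_subj), each = n_obs)),
  iq        = rep(iq, each = n_obs),
  treatment = rep(treatment, each = n_obs)
)
d$log_effort <- 2 + true_effect * d$treatment - 0.5 * d$iq +
  rep(b0, each = n_obs) + rnorm(nrow(d), sd = 0.5)

# Model that omits the confounder vs. model that (unrealistically) observes it
naive    <- lme(log_effort ~ treatment,      random = ~ 1 | subject, data = d)
adjusted <- lme(log_effort ~ treatment + iq, random = ~ 1 | subject, data = d)

est <- c(truth    = true_effect,
         naive    = fixef(naive)[["treatment"]],
         adjusted = fixef(adjusted)[["treatment"]])
print(round(est, 3))
```

Note that treatment is constant within subject in this simulation, so there is no within-subject variation in treatment for a within estimator to use; the random intercept alone does not remove the confounding.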

BTW, Ben, thank you for your extensive support across multiple sites in helping the general public with mixed models in R. I have relied upon an EXTENSIVE number of your answers to mixed model questions when developing my models.

-----Original Message-----
From: Ben Bolker <bbolker at gmail.com>
Sent: Sunday, April 12, 2020 7:46 PM
To: John Poe <jdpoe223 at gmail.com>
Cc: Slaughter, Kelly <KELLY.SLAUGHTER at tcu.edu>; r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] Controlling for self-selection bias / endogeneity in mixed models

 Wow, this is the kind of content I come here for.  (It will take me a while to digest this ...) Thank you!

On Sun, Apr 12, 2020 at 8:36 PM John Poe <jdpoe223 at gmail.com> wrote:

Hi Kelly,

It sounds like you've got correct reasoning on the need for a
multilevel model if your variable of interest is time invariant.

Can you post a link to the thread you're referencing?

A bit of clarity on the flavor(s) of endogeneity that concern you
might be helpful. The omitted variable bias issues solved by group
mean centering and the Mundlak device are mostly from model
mis/underspecification whereas sample selection is a fundamentally
different mechanism. Both are common sources of endogeneity recognized
as such in different pockets of econ but they tend to be seen as
fundamentally different (often conceptually
unrelated) problems in other fields. Econ subsumes omitted variables,
joint causation, measurement error, and sample selection under the
endogeneity umbrella because they all cause correlation between X and
the error but other fields don't make the same connection. For
instance, early panel data work talked about Mundlak devices as
"instruments" in the same way that dynamic panel data models talk
about lags and first differences as instruments but they aren't
traditional instrumental variables that you'd find in the wild and
arguably wouldn't pass the exclusion restriction test outside of panel
data. They call them instruments because they instrument the
endogeneity but they aren't "instrumental variables" in the common parlance.

It's not clear to me if you are referring to general omitted variable
bias whereby you don't have all the appropriate variables in the model
or sample selection bias a la Heckman whereby the sample under study
is systematically different from the population to which you would
like to make inferences and thus needs some kind of complex propensity
to choose A or B style correction like with the standard selection
model. I'm not clear specifically because you referenced the inverse
Mills ratio but it *sounds* like you just think you are possibly
missing some set of confounders due to the lack of randomization. If
you do have sample selection bias you can use a multilevel variant of
a Heckman selection model with random effects in the outcome and
selection equations. See Grilli, L., & Rampichini, C.
(2010). Selection bias in linear mixed models. *Metron, 68*(3),
309-329 for the best discussion of the topic that I've read. Most
multilevel modeling work with this kind of problem is based on
multilevel propensity score matching, which is a close cousin of
multilevel Heckman selection models as the inverse Mills ratio and
the propensity score are related.

You're right that the addition of group means per Mundlak segregates
the within and between effects into two different sets of betas when
they would otherwise be a weighted average. It's just a
reparameterization of the dummy variable version of fixed effects. It
is mathematically impossible in a linear model for a group mean
centered multilevel model to return different within group beta
coefficients than the standard FE model. That doesn't mean that both
of them aren't wrong because of cross-level interactions, measurement
error, selection bias and what not but they would both be wrong in
identical ways. You can directly test that they are identical with a
version of a Hausman test comparing the within group betas with a chi2
test. The degrees of freedom calculation will be off from the regular
test because the between effects add extra but the within effects will
be identical to rounding error so it really won't matter. You can also
just do a Mundlak variation on the test. All panel data econometrics
textbooks outline this and you can justify the modeling strategy that way regardless of reviewer misconceptions.
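
A minimal sketch of the equivalence described above, on simulated data with nlme. The data-generating values and the names `x_within` and `x_between` are assumptions for illustration. It compares the coefficient on `x` from a dummy-variable fixed effects fit with the within coefficient from a group-mean-centered random-intercept model, and shows the plain random-intercept estimate for contrast.

```r
library(nlme)

set.seed(1)
n_subj <- 50
n_obs  <- 8
subject <- factor(rep(seq_len(n_subj), each = n_obs))
id <- as.integer(subject)

u <- rnorm(n_subj)                                  # unobserved subject trait
x <- 0.8 * u[id] + rnorm(n_subj * n_obs)            # level-1 covariate correlated with u
y <- 1 + 0.5 * x + u[id] + rnorm(n_subj * n_obs, sd = 0.5)
d <- data.frame(subject, x, y)

# Split x into its subject mean (between) and deviation from that mean (within)
d$x_between <- ave(d$x, d$subject)
d$x_within  <- d$x - d$x_between

fe       <- lm(y ~ x + subject, data = d)                                   # dummy-variable FE
rewb     <- lme(y ~ x_within + x_between, random = ~ 1 | subject, data = d) # within-between
plain_re <- lme(y ~ x, random = ~ 1 | subject, data = d)                    # uncentered RE

est <- c(fe_within   = coef(fe)[["x"]],
         rewb_within = fixef(rewb)[["x_within"]],
         plain_re    = fixef(plain_re)[["x"]])
print(est)
```

The first two entries should agree up to numerical error, while the uncentered random-intercept estimate mixes within and between information and so can differ when `x` is correlated with the subject effect.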

If the FE or group mean centered MLM are both wrong and there's some
kind of interactive effect still at work then a random coefficient
will likely show up as mattering for model fit with something like an
LR test. If the beta of (X_i - Xbar_j) on Y does not vary as a
function of group per an LR test
or something fancier like WAIC then it is reasonable (but not
infallible) evidence that you don't have group heterogeneity-related
omitted variable bias which is what economists would typically be
concerned about in this context. You can still have other kinds of
bias at work just like with any other kind of observational model. The
random coefficient in this context is a regularized interactive fixed
effect in econ jargon whereby you are interacting the grouping
structure with whatever X you want and getting a distribution of
effects. Fundamentally, it's like saying you have some kind of
conditional relationship between group/person and X and just
interacting them. It's slightly complicated by the fact that empirical Bayes shrinkage exists but if you have balanced panels then it's mostly a non-issue.
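
The LR test mentioned above might look like the following in nlme. This is a sketch on simulated data in which the within effect is deliberately made to vary by subject; the names and parameter values are assumptions. Both models share the same fixed effects, so they are compared under REML.

```r
library(nlme)

set.seed(4)
n_subj <- 60
n_obs  <- 10
subject <- factor(rep(seq_len(n_subj), each = n_obs))
id <- as.integer(subject)

b0 <- rnorm(n_subj, sd = 0.5)    # subject-specific intercept deviations
b1 <- rnorm(n_subj, sd = 0.5)    # subject-specific slope deviations (heterogeneity)
x  <- rnorm(n_subj * n_obs)
y  <- 1 + b0[id] + (0.5 + b1[id]) * x + rnorm(n_subj * n_obs, sd = 0.5)
d  <- data.frame(subject, x, y)

d$x_between <- ave(d$x, d$subject)
d$x_within  <- d$x - d$x_between

# Random intercept only vs. random intercept plus random slope on the within term
m_int   <- lme(y ~ x_within + x_between, random = ~ 1 | subject,
               data = d, method = "REML")
m_slope <- lme(y ~ x_within + x_between, random = ~ x_within | subject,
               data = d, method = "REML")

lrt <- anova(m_int, m_slope)     # likelihood ratio test of the added slope variance
print(lrt)
```

Because a variance is being tested at the boundary of its parameter space, the chi-squared p-value reported here tends to be conservative, so a clearly significant result is reasonably safe to read as evidence of slope heterogeneity.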



On Sun, Apr 12, 2020 at 7:34 PM Slaughter, Kelly
<KELLY.SLAUGHTER at tcu.edu>
wrote:

Hi all -

I have a concern regarding self-selection/omitted variable bias. I
have a longitudinal/repeated measures model, theorizing about a
relationship between treatment/control and effort, represented in nlme syntax as:

EQ 1) log(effort measured in time) ~ treatment*scale(experience),
random = ~1|subject

Treatment/control is selected by the subject, it is not randomized,
thus raising endogeneity concerns. My background is applied econ, so
as I learn the mixed model domain, I expected to find the mixed
model equivalent of instrumental variables/inverse Mills ratio, etc.
Yet there is surprisingly (to me) limited material addressing this
issue. The best reference material I found is in fact a thread in
this mailing list from October 2016 and the papers referenced within, leading to Bell, Fairbrother, and Jones (2019).
My first impression is that I should employ a within-between random
effects (REWB) model -

EQ 2) log(effort measured in time) ~ treatment*scale(experience) +
experience_between + experience_within, random = experience_within +
scale(experience) | subject

If I understand correctly, the intuition is that the addition of a
group mean explanatory variable "breaks out" the variability that
would be associated with an omitted variable / error term. Per Bell
et al, "there can be no correlation between level 1 variables
included in the model and the level 2 random effects...unchanging
and/or unmeasured characteristics of an individual (such as
intelligence, ability, etc.) will be controlled out of the estimate of the within effect."

So, no concern between the subject (level 2) and treatment (level 1)
via REWB, wonderful!

Bell et al caution, "...in a REWB/Mundlak models, unmeasured level 2
characteristics can cause bias in the estimates of between effects
and effects of other level 2 variables."

Not an issue for me - I am not concerned with level 2, I include
subject to address the IID violation but am interested in
population, not subject, performance.

Bell et al continue, "However, unobserved time-varying
characteristics can still cause biases at level 1 in either an FE or a REWB/Mundlak model."

Though conceptually my treatment variable is time-varying (it can
change across time within a subject), as a practical/empirical
matter, the treatment is unchanging within the subject - subjects
have no reason to change / would prefer to keep the choice constant.
Of 80k records, treatment switches within a subject occur in about a dozen records.

So, I think I have my solution. However, if a reviewer is not happy
with the within / between REWB solution (worried about the level 1
bias), I can further defend EQ 2 via its random coefficient/slope,
if I understand the Oct 2016 thread correctly.

So, my questions are:

(1) Is the above correctly reasoned?

(2) If the random slope model is a further defense against
self-selection bias, could someone provide an intuitive explanation
as to why? Is the idea that by allowing slopes to vary, there is no
endogeneity problem to solve as the very structure of the model
makes the correlated errors concern irrelevant?

Other solutions I explore include a Mundlak model, but per Bell et
al, the Mundlak models are not meaningful for repeated measures.
Also, it appears that the brms package supports mixed
modeling using instrumental variables, something I am more
comfortable with per my background, but strong instrumental variables are hard to find in the wild!

Thank you! - Kelly



_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
