
Is it kosher to use random-intercept estimates as explanatory variables in another model?

4 messages · Jeremy Koster, Reinhold Kliegl, Gebregziabher, Mulugeta +1 more

#
I'm reviewing a paper for a colleague, and I haven't seen this done before.

Imagine that she has a sample of 100 houses, all of which include children who raise chickens.  She includes a random term for household and finds that there is substantial household-level variance in chicken husbandry by kids.

She then takes the household-level estimates (i.e., plus/minus relative to the model intercept) and uses them as an explanatory variable in an OLS model with households as the sampling unit.  For example, she would predict something like household-level income while using the random-intercept estimates from the chicken analysis (and other covariates).

At first glance, this might seem relatively straightforward, but I haven't encountered similar analyses, and I'm wondering about potential pitfalls . . . particularly given the variable number of kids in each house.

Any thoughts?

Thanks!
#
The random effects are not independent "observations"; the amount of
shrinkage depends on the model parameters, which are estimated from all
the data. So unless there is no shrinkage associated with the random
effects, this is not a good idea. It may be better to think about
including the other variables (plus suitable interaction terms) in the
first model. Alternatively, a structural equation model may be a
better path to pursue.
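
A minimal sketch of the shrinkage point, assuming a random-intercept-only model whose variance components are known (in practice they are estimated from all the data, which is the source of the dependence noted above). Under that assumption the estimate for a household is its raw deviation from the grand mean multiplied by tau2 / (tau2 + sigma2 / n_kids). The function name and all numbers are hypothetical and chosen only for illustration.

```python
# Shrinkage of household random-intercept estimates, assuming a
# random-intercept-only model with KNOWN variance components.
# All values below are hypothetical illustration values.

def shrinkage_factor(n_kids, tau2, sigma2):
    """Weight applied to a household's raw deviation from the grand mean."""
    return tau2 / (tau2 + sigma2 / n_kids)

tau2 = 1.0           # assumed between-household variance
sigma2 = 4.0         # assumed within-household (child-level) variance
raw_deviation = 2.0  # same raw household deviation in every case

for n_kids in (1, 2, 5, 10):
    lam = shrinkage_factor(n_kids, tau2, sigma2)
    # Households with fewer kids are pulled harder toward zero.
    print(n_kids, round(lam, 3), round(lam * raw_deviation, 3))
```

The same raw deviation yields a different estimate depending on the number of kids in the house, so the second-stage predictor mixes household signal with household size.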

Reinhold Kliegl
On Mon, Jun 6, 2011 at 7:55 PM, Jeremy Koster <helixed2 at yahoo.com> wrote:
#
This looks like the two-stage joint modeling approach used in Ye et al. (2008) and Gebregziabher et al. (2010). The key, I think, is to use some kind of robust variance estimate for the coefficients on the first-stage estimates when they are used as covariates in the second stage.
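
To make the two-stage idea concrete, here is a small simulation sketch. It is not the method of either paper: it assumes the grand mean and variance components are known, uses made-up parameter values, and says nothing about the variance correction, which is the hard part. It only compares the second-stage OLS slope when the covariate is the raw household deviation versus the shrunken estimate.

```python
import random

random.seed(1)

# Hypothetical parameter values, assumed known for this illustration.
TAU, SIGMA = 1.0, 2.0  # SDs: between households, between kids within a household
BETA = 2.0             # true effect of the household intercept on income
MU = 10.0              # grand mean of the child-level outcome

def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

raw, shrunk, income = [], [], []
for _ in range(2000):
    n_kids = random.randint(1, 6)           # variable number of kids
    u = random.gauss(0.0, TAU)              # true household effect
    kids = [MU + u + random.gauss(0.0, SIGMA) for _ in range(n_kids)]
    deviation = sum(kids) / n_kids - MU     # stage 1: raw household deviation
    lam = TAU**2 / (TAU**2 + SIGMA**2 / n_kids)
    raw.append(deviation)
    shrunk.append(lam * deviation)          # stage 1: shrunken estimate
    income.append(BETA * u + random.gauss(0.0, 1.0))

# Stage 2: household-level OLS on each version of the covariate.
slope_raw = ols_slope(raw, income)
slope_shrunk = ols_slope(shrunk, income)
print("slope on raw deviations:    ", round(slope_raw, 2))
print("slope on shrunken estimates:", round(slope_shrunk, 2))
```

In this idealized setting the slope on the shrunken estimates lands near BETA while the slope on the raw deviations is attenuated. The naive OLS standard errors still treat the first-stage estimates as if they were observed without error, which is where a robust or corrected variance comes in.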

References
Ye W, Lin X, Taylor JMG (2008) Semiparametric modeling of longitudinal measurements and time-to-event data: a two-stage regression calibration approach. Biometrics 64(4):1238-1246.
Gebregziabher M, Egede LE, et al (2010) Effect of trajectories of glycemic control on mortality in type 2 diabetes: a semiparametric joint modeling approach. Am J Epidemiol 171(10):1090-1098.

Hope this helps.
Mulugeta


-----Original Message-----
From: r-sig-mixed-models-bounces at r-project.org [mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of Jeremy Koster
Sent: Monday, June 06, 2011 1:55 PM
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] Is it kosher to use random-intercept estimates as explanatory variables in another model?

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
#
She could do this in a single step using a multilevel model that includes
group-level predictors to model part of the variation associated with the
intercept.
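
One way to write such a model, as a sketch only: the symbols below are assumptions for illustration, with y_ij the husbandry measure for child i in household j, x_ij a child-level covariate, and z_j a household-level predictor such as income.

```latex
\begin{align}
y_{ij}   &= \alpha_j + \beta x_{ij} + \varepsilon_{ij},
         & \varepsilon_{ij} &\sim N(0, \sigma_y^2), \\
\alpha_j &= \gamma_0 + \gamma_1 z_j + \eta_j,
         & \eta_j &\sim N(0, \sigma_\alpha^2).
\end{align}
```

Note that in this formulation the household-level variable predicts the intercept, rather than the intercept estimate predicting the household-level variable, and everything is estimated in one fit so the uncertainty in the intercepts is carried through.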

Gelman & Hill include a nice example and discussion of the effect that
group-level predictors have on the estimates of observation-level parameters
in Section 12.6 of their book.

-Christos  
