Dear Steve,
There's a lot in your question. A couple thoughts:
(1) I'm not clear whether you *country-mean-centered* your
individual-level covariates. If not, the country means of those variables
could (i.e., almost certainly will) correlate to some degree with your
country-level variables. This will almost certainly just confuse matters,
such that it would be best to do the mean-centering. (If you want to
include, say, national mean education as a covariate, in addition to
de-meaned individual level education, you can do so... But that will
probably correlate a lot with, say, GDP/capita.) From what you say,
mean-centering will get you what you want, and it actually might also help
you deal with the unhelpful reviewer comments you're getting. (I totally
agree with your reactions to those. Given what appears to be the paucity of
logic behind their comments, surreptitiously not doing what they're saying
but appearing to do what they're saying seems a reasonable strategy.
Implicitly including country means increases the degrees of freedom you use
at the country level, causing a reduction in efficiency, as you suggest...
Though it's an issue of collinearity, not just missingness.)
So I think you're wrong that "individual-level variables don't
meaningfully influence the parameter estimates for country-level variables
beyond inefficiency introduced by missing data." But I think you can
nonetheless ignore them--because only the country mean components are
having the impacts you describe, and you seem to have substantive reasons
to remove those components.
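
To make the country-mean-centering in point (1) concrete, here is a minimal sketch in plain Python. Everything in it is hypothetical and for illustration only: the toy data, the column names (country, educ), the gdp values, and the helper country_mean_center. It splits an individual-level covariate into its country mean and a within-country deviation, then checks that the deviation has a zero cross-product with a country-level variable.

```python
from collections import defaultdict

def country_mean_center(rows, group_key, var):
    """Return copies of rows with var split into a country mean
    (var + '_mean') and a within-country deviation (var + '_within')."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        sums[row[group_key]] += row[var]
        counts[row[group_key]] += 1
    means = {g: sums[g] / counts[g] for g in sums}

    out = []
    for row in rows:
        m = means[row[group_key]]
        new_row = dict(row)
        new_row[var + "_mean"] = m
        new_row[var + "_within"] = row[var] - m
        out.append(new_row)
    return out

# Hypothetical toy survey: two countries, one individual-level covariate.
survey = [
    {"country": "A", "educ": 10.0},
    {"country": "A", "educ": 14.0},
    {"country": "B", "educ": 8.0},
    {"country": "B", "educ": 12.0},
    {"country": "B", "educ": 19.0},
]
centered = country_mean_center(survey, "country", "educ")

# Hypothetical country-level variable (think GDP/capita).
gdp = {"A": 30.0, "B": 45.0}

# The within-country deviations sum to zero inside each country, so their
# cross-product with anything that is constant within country is zero.
cross = sum(r["educ_within"] * gdp[r["country"]] for r in centered)
```

Only the educ_within column would go into the model as the individual-level covariate; educ_mean is the part that can be collinear with country-level variables. One caveat under these assumptions: the means should be computed on the rows actually used in estimation, since listwise deletion changes them.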
(2) Like you, I've never found the "correlation of fixed effects" output
very useful. I generally just suppress/ignore it.
Hope that helps.
- Malcolm
Dr Malcolm Fairbrother
Senior Lecturer in Global Policy and Politics
School of Geographical Sciences
University of Bristol
Date: Mon, 22 Feb 2016 15:46:17 -0500
From: svm <steven.v.miller at gmail.com>
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] For what can I use a correlation of fixed effects from (g)lmer?
Hi all,
I have a question about how I could possibly use the correlation of
fixed effects that comes standard with every (g)lmer call. I'll briefly
explain the situation I'm encountering.
- I use mixed effects models mostly for cross-national survey research.
  I have both individual-level fixed effects and country-level fixed
  effects.
- My interest is mostly the country-level fixed effects. The
  individual-level stuff tends to be standard "controls" that reviewers
  want to see.
- I'm not convinced the individual-level fixed effects are entirely
  necessary. My hunch is they just make for inefficient estimates of the
  country-level fixed effects that interest me. The individual-level
  variables just create missing data problems. However, they're things
  that reviewers insist on seeing, absent any other information about
  what a mixed effects model is doing.
I have a project (manuscript here:
https://www.dropbox.com/s/harb6ylpcxdpalr/etst.pdf?dl=0 | appendix here:
https://www.dropbox.com/s/pq8gmr7v1xvvu2h/etst-appendix.pdf?dl=0) that
reviewers rejected because the country-level fixed effects were rendered
statistically insignificant (i.e. not discernible from zero) upon the
inclusion of the individual-level variables. They said that one
individual-level attribute (which by itself contributes to listwise
deletion of 30% of the data) somehow rendered the country-level fixed
effects "spurious" once it was included. This already strikes me as a bold claim for
theoretical and statistical reasons, but here's what I did to circumvent
this claim:
- Estimate just the country-level fixed effects.
- Use multiple imputation to generate five full data sets and merge in
  the macro-level information after the imputation. The results for the
  country-level fixed effects were almost identical to the analyses with
  just the country-level fixed effects.
- Omit the offending individual-level variables that contribute the most
  missingness. These results were consistent with the results from the
  other two estimation strategies.
However, the reviewers just didn't buy it and torpedoed the manuscript.
Is this something that the correlation of fixed effects could be useful in
addressing? Here's the correlation of fixed effects (without the
intercepts) for the analysis in question. In this analysis, the three
variables at the bottom row (i.e. the two threat indices and the level of
democracy) are the country-level variables for this cross-national survey
analysis. The other variables are individual-level attributes. It's worth
reiterating that every variable that is not binary is scaled by two
standard deviations to create a meaningful zero.
http://i.imgur.com/eIiZH9b.png
Notice that the bottom-left quadrant is entirely white (i.e. the
correlation of the individual-level fixed effects with the country-level
fixed effects is basically zero). Is this telling me that the correlation
for any one individual-level fixed effect and a country-level fixed effect
is almost zero (i.e. they have almost no bearing on each other)? The most
I've seen anyone discuss this correlation matrix is here:
https://stat.ethz.ch/pipermail/r-sig-mixed-models/2009q1/001941.html
> It is an approximate correlation of the estimator of the fixed
> effects. (I include the word "approximate" because I should but in
> this case the approximation is very good.) I'm not sure how to
> explain it better than that. Suppose that you took an MCMC sample
> from the parameters in the model, then you would expect the sample of
> the fixed-effects parameters to display a correlation structure like
> this matrix.
and here (
http://stats.stackexchange.com/questions/57240/how-do-i-interpret-the-correlations-of-fixed-effects-in-my-glmer-output
):
> The "correlation of fixed effects" output doesn't have the intuitive
> meaning that most would ascribe to it. Specifically, is not about the
> correlation of the variables (as OP notes). It is in fact about the
> expected correlation of the regression coefficients. Although this may
> speak to multicollinearity it does not necessarily.
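
The two descriptions quoted above can be made concrete with a small sketch: the "correlation of fixed effects" is the estimated covariance matrix of the coefficient estimates rescaled to a correlation matrix. The plain-Python example below uses a made-up 3x3 covariance matrix, not output from any fitted model, and cov_to_cor is a hypothetical helper name. In R, the analogous step would, as far as I know, be cov2cor() applied to vcov() of the fitted model.

```python
import math

def cov_to_cor(vcov):
    """Rescale a covariance matrix (list of lists) to a correlation matrix:
    cor[i][j] = vcov[i][j] / (sd[i] * sd[j]), with sd[i] = sqrt(vcov[i][i])."""
    k = len(vcov)
    sd = [math.sqrt(vcov[i][i]) for i in range(k)]
    return [[vcov[i][j] / (sd[i] * sd[j]) for j in range(k)] for i in range(k)]

# Hypothetical covariance matrix of three coefficient estimates.
# The diagonal holds squared standard errors; the off-diagonals hold the
# covariances between pairs of estimates.
vcov = [
    [0.04, 0.00, 0.01],
    [0.00, 0.09, 0.00],
    [0.01, 0.00, 0.25],
]
cor = cov_to_cor(vcov)

# An off-diagonal entry near zero means the two *estimates* are nearly
# uncorrelated; it says nothing directly about how strongly the underlying
# *variables* correlate in the data.
for row in cor:
    print(["%.2f" % value for value in row])
```

This is only the rescaling step; it says nothing about how the covariance matrix itself is obtained from the model.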
I should add that I've estimated hundreds of mixed effects models with
individual-level and country-level variables and they all have fixed
effects correlation matrices that resemble these. I have a strong hunch
that individual-level variables don't meaningfully influence the parameter
estimates for country-level variables beyond inefficiency introduced by
missing data. In research projects where individual-level attributes don't
concern the project, I'd like to ignore them for that reason. They just
create estimation problems and slow down computation.
I might be mistaken, which is why I ask here. I thank you for your time.