Level 2 outcome and 'Downdated VtV' error
Thank you for these responses. I figured this was the case (that you shouldn't predict a Level 2 variable in a mixed model), but followed contrary advice from a colleague. Appreciate the help.

Matt

On Tue, Jul 7, 2020 at 6:16 AM Patrick (Malone Quantitative) <malone at malonequantitative.com> wrote:
Agreed with the others. Chiming in only because I've recently been doing research on such aggregation, and I can say the consensus seems to be that it doesn't introduce bias (with the possible exception of very small clusters, which you don't have).

On Tue, Jul 7, 2020 at 6:40 AM Viechtbauer, Wolfgang (SP) <wolfgang.viechtbauer at maastrichtuniversity.nl> wrote:
Hi Matt,

What you are trying to do (i.e., use a level 2 variable as the outcome) cannot and should not be done. The outcome in a multilevel model needs to be measured at the lowest level.
In your model (A1), we know a priori that there is 0 within-station
variability. Hence, the ICC is exactly equal to 1 in that model, but trying to fit such a model pushes the optimization routines into a situation that leads to degeneracies.
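
As a rough illustration of the point about zero within-station variability, the check below counts the distinct outcome values per station. It is only a sketch on made-up toy data: the names DISP, Station, and PopCov come from the original post, while Age and all of the values are hypothetical.

```r
# Toy data standing in for the poster's DISP data frame (values are made up).
set.seed(1)
DISP <- data.frame(
  Station = rep(c("A", "B", "C"), times = c(4, 3, 5)),
  Age     = round(rnorm(12, mean = 50, sd = 10))  # hypothetical patient-level variable
)

# PopCov is a facility-level value, so it is copied to every patient row.
station_popcov <- c(A = 0.42, B = 0.57, C = 0.61)
DISP$PopCov <- unname(station_popcov[DISP$Station])

# Number of distinct outcome values within each station.
n_distinct <- tapply(DISP$PopCov, DISP$Station, function(x) length(unique(x)))

# TRUE means the outcome never varies within a station, so the
# within-station variance is exactly zero and all variance is between stations.
is_level2_outcome <- all(n_distinct == 1)
print(is_level2_outcome)
```

If this check comes back TRUE on the real data, a random-intercept model for that outcome has no residual variance left to estimate.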
The only way to get around this is to aggregate the data to the level of
the outcome (i.e., use PopCov as the outcome and aggregate all other level 1 predictors to level 2 means).
Best, Wolfgang
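
The aggregation suggested above might look roughly like the following. This is a sketch under stated assumptions, not the poster's actual analysis: DISP, Station, and PopCov are taken from the original post, whereas Age, Severity, the simulated values, and the choice of an ordinary lm() fit on station means are illustrative assumptions.

```r
# Simulated patient-level data (hypothetical values) with unbalanced clusters.
set.seed(42)
n_station <- 30
sizes     <- sample(20:60, n_station, replace = TRUE)
Station   <- rep(seq_len(n_station), times = sizes)

DISP <- data.frame(
  Station  = factor(Station),
  Age      = rnorm(length(Station), mean = 50, sd = 12),  # hypothetical level 1 predictor
  Severity = rnorm(length(Station))                       # hypothetical level 1 predictor
)

# Facility-level outcome: one value per station, repeated on every patient row.
popcov_by_station <- runif(n_station)
DISP$PopCov <- popcov_by_station[Station]

# Collapse to one row per station. The mean of PopCov is just its constant
# value; the means of Age and Severity become level 2 predictors.
station_df <- aggregate(cbind(PopCov, Age, Severity) ~ Station,
                        data = DISP, FUN = mean)
names(station_df)[names(station_df) == "Age"]      <- "MeanAge"
names(station_df)[names(station_df) == "Severity"] <- "MeanSeverity"

# Keep the cluster size for reference (optional).
station_df$n_patients <- as.vector(table(DISP$Station)[as.character(station_df$Station)])

# Single-level regression at the station level.
fit <- lm(PopCov ~ MeanAge + MeanSeverity, data = station_df)
print(summary(fit))
```

After aggregation there is one observation per station, so no random effect for Station is needed.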
-----Original Message-----
From: R-sig-mixed-models [mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of Matthew Boden
Sent: Tuesday, 07 July, 2020 0:19
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] Level 2 outcome and 'Downdated VtV' error

Good afternoon,

I am looking for advice regarding a multi-level model I am trying to implement using lme4. My two-level random-effects model won't run, perhaps due to one or two issues.

Background: Level 1 is patients, who are clustered in healthcare facilities ('Station'). The outcome is a continuous variable ('PopCov') that is calculated at the facility level, and is thus a Level 2 variable that does not vary at the patient level. The aim of this analysis is to examine whether PopCov is predicted by (a) patient-level variables (e.g., race/ethnicity, age, symptom severity), and (b) facility-level variables (e.g., overall racial/ethnic composition, average age). It is important to examine factors such as race/ethnicity at both patient and facility levels because patients with different racial/ethnic backgrounds tend to differ in terms of age, symptom severity, etc.

Each record/row in my data is a patient, with facility-level variables (including PopCov) having identical values among patients within a given facility.

An error is thrown when I run a basic model.

    A1 <- lmer(PopCov ~ (1 | Station), data = DISP)
    Error in fn(nM$xeval()) : Downdated VtV is not positive definite

I obtain the same error when I add to the model either a patient-level or facility-level predictor. An internet search suggested that I have complete separation of my data and/or poorly scaled variables.

I assume this issue has to do with the fact that the outcome is a level 2 variable. Perhaps compounding the issue is the large and unbalanced nature of the data. I have ~6 million patients clustered in ~1000 healthcare facilities. Individual facilities have anywhere from 100 to 30000 patients clustered in them.

I could use some advice regarding how to specify the model to predict a facility-level variable (level 2) from both patient-level (level 1) and facility-level (level 2) variables with these data.

Thank you in advance.

Matt
--
Patrick S. Malone, Ph.D., Malone Quantitative
NEW Service Models: http://malonequantitative.com
He/Him/His