Downdated VtV error for two level mixed model

Wed, Jul 8, 2020 10:36 AM

Good afternoon,

I am looking for advice regarding a mixed model I am trying to implement
using lme4. My two-level random-effects model won't run, perhaps due to one
or two issues.

Level 1 consists of patients clustered in healthcare facilities ("Station").
The outcome is a continuous variable ("PopCov") that is calculated at the
facility level, and is thus a level 2 variable that does not vary at the
patient level.

My aim is to examine the prediction of PopCov by (a) patient-level variables
(e.g., race/ethnicity, age, symptom severity) and (b) facility-level
variables (e.g., overall racial/ethnic composition, average age). It is
important to examine race/ethnicity at both the patient and facility levels
because patients with different racial/ethnic backgrounds tend to differ in
terms of age, symptom severity, etc.

Each record/row in my data set is a patient, with facility-level variables
(including PopCov) having identical values among patients within a given
facility.
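
The layout described above can be checked directly. What follows is a minimal sketch in base R, using a small invented data frame (DISP_toy, with made-up values and a hypothetical Age column) in place of the real DISP. It counts the distinct PopCov values within each Station; if every facility has exactly one value, the outcome has no patient-level variation left over for a random-intercept model to treat as residual variance.

```r
# Toy stand-in for DISP; every name and value here is invented for illustration.
set.seed(1)
sizes  <- c(S1 = 3, S2 = 4, S3 = 2, S4 = 5, S5 = 3)             # patients per facility
popcov <- c(S1 = 0.42, S2 = 0.55, S3 = 0.31, S4 = 0.60, S5 = 0.47)

DISP_toy <- data.frame(
  Station = rep(names(sizes), times = sizes),
  stringsAsFactors = FALSE
)
# Facility-level outcome copied onto every patient row, as in the post
DISP_toy$PopCov <- unname(popcov[DISP_toy$Station])
# A patient-level covariate that does vary within facility
DISP_toy$Age <- round(rnorm(nrow(DISP_toy), mean = 50, sd = 10))

# Number of distinct outcome values per facility
n_distinct <- tapply(DISP_toy$PopCov, DISP_toy$Station,
                     function(x) length(unique(x)))
print(n_distinct)

# TRUE means the outcome is constant within every facility
outcome_is_level2 <- all(n_distinct == 1)
print(outcome_is_level2)
```

On the real data the same tapply() call would be run on DISP$PopCov and DISP$Station.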

An error is thrown when I run a basic model.

A1 <- lmer(PopCov ~ (1 | Station), data = DISP)

Error in fn(nM$xeval()) : Downdated VtV is not positive definite

I obtain the same error when I add either a patient-level or a
facility-level predictor to the model.

An internet search suggested that I have complete separation of my data
and/or poorly scaled variables.

I assume this issue has to do with the fact that the outcome is a level 2
variable. Perhaps compounding the issue is the large and unbalanced nature
of the data. I have ~6 million patients clustered in ~1000 healthcare
facilities. Individual facilities have anywhere from 100 to 30000 patients
clustered in them.

I could use some advice regarding how to specify the model to predict a
facility-level variable (level 2) from both patient (level 1) and
facility-level (level 2) variables with these data.
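
One commonly used option when the outcome exists only at the facility level is to collapse the data to one row per facility and fit an ordinary regression on facility-level summaries. The sketch below assumes that approach suits the question; it is not taken from the replies in this thread. DISP_toy, Age, MeanAge and all values are invented for illustration. In this setup patient-level predictors enter only through their facility means.

```r
# Toy stand-in for DISP; every name and value here is invented for illustration.
set.seed(1)
sizes  <- c(S1 = 3, S2 = 4, S3 = 2, S4 = 5, S5 = 3)             # patients per facility
popcov <- c(S1 = 0.42, S2 = 0.55, S3 = 0.31, S4 = 0.60, S5 = 0.47)

DISP_toy <- data.frame(
  Station = rep(names(sizes), times = sizes),
  stringsAsFactors = FALSE
)
DISP_toy$PopCov <- unname(popcov[DISP_toy$Station])
DISP_toy$Age    <- round(rnorm(nrow(DISP_toy), mean = 50, sd = 10))

# Collapse to one row per facility: PopCov is constant within facility,
# so its mean is just its value; Age becomes the facility mean age.
fac <- aggregate(cbind(PopCov, Age) ~ Station, data = DISP_toy, FUN = mean)
names(fac)[names(fac) == "Age"] <- "MeanAge"

# Keep the number of patients per facility alongside the summaries
fac$n <- as.vector(table(DISP_toy$Station)[fac$Station])

# Ordinary linear model at the facility level (one observation per facility)
fit <- lm(PopCov ~ MeanAge, data = fac)
print(fac)
print(coef(fit))
```

With roughly 1000 facilities, the collapsed data frame would have roughly 1000 rows, whatever the number of patients.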

Thank you in advance.

Matt

Wolfgang Viechtbauer

Wed, Jul 8, 2020 10:53 AM

Hi Matt,

You have already received some answers to your previous post:

https://stat.ethz.ch/pipermail/r-sig-mixed-models/2020q3/028806.html
https://stat.ethz.ch/pipermail/r-sig-mixed-models/2020q3/028807.html
https://stat.ethz.ch/pipermail/r-sig-mixed-models/2020q3/028808.html

Best,
Wolfgang

-----Original Message-----
From: R-sig-mixed-models [mailto:r-sig-mixed-models-bounces at r-project.org]
On Behalf Of Matthew Boden
Sent: Wednesday, 08 July, 2020 19:36
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] Downdated VtV error for two level mixed model
