Skip to content

Help: Interpreting unusual model fit results from generalized linear mixed model (glmmTMB/sjPlot) (Andre Syvertsen)

2 messages · Andre Syvertsen, Daniel Lüdecke

#
Text version of table:

  Active Gambling Days Monthly
Predictors Incidence Rate Ratios, CI,  p
(Intercept):   1.18, 1.16 ? 1.20,  <0.001
Time : 0.99, 0.99 ? 0.99,  <0.001
Age Category 30-39:  1.29, 1.26 ? 1.32, <0.001
Age Category 40-49:  1.81, 1.78 ? 1.85, <0.001
Age Category 50-59:  2.47, 2.41 ? 2.53, <0.001
Age Category 60-69:  3.08, 2.99 ? 3.17, <0.001
Age Category 70+:  3.42, 3.29 ? 3.56, <0.001
Gender-Women:   1.69, 1.65 ? 1.74, <0.001
Age Category 30-39-Women: 0.90, 0.86 ? 0.94, <0.001
Age Category 40-49-Women: 0.67, 0.65 ? 0.70, <0.001
Age Category 50-59-Women: 0.53, 0.50 ? 0.55, <0.001
Age Category 60-69- Women: 0.46, 0.44 ? 0.48, <0.001
Age Category 70+-Women:  0.45, 0.43 ? 0.48, <0.001

Random Effects
?2 0.00
?00 id 1.63
?11 id.time 0.00
?01 id -0.38
ICC 1.00
N id 184113

Observations 3,231,544
Marginal R2 / Conditional R2 0.087 / 1.000
Note: Intercept = Men, age 18-29 years, first time point (month 0 of 69).


________________________________
Fra: R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org> p? vegne av r-sig-mixed-models-request at r-project.org <r-sig-mixed-models-request at r-project.org>
Sendt: fredag 26. februar 2021 11:21
Til: r-sig-mixed-models at r-project.org <r-sig-mixed-models at r-project.org>
Emne: R-sig-mixed-models Digest, Vol 170, Issue 20

Send R-sig-mixed-models mailing list submissions to
        r-sig-mixed-models at r-project.org

To subscribe or unsubscribe via the World Wide Web, visit
        https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
or, via email, send a message with subject or body 'help' to
        r-sig-mixed-models-request at r-project.org

You can reach the person managing the list at
        r-sig-mixed-models-owner at r-project.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of R-sig-mixed-models digest..."


Today's Topics:

   1. Residual Variance-Covariance matrix (Yashree Mehta)
   2. Re: lmer code for multiple random slopes (Phillip Alday)
   3. question regarding time as a continuous factor in a linear
      mixed effects model (Laura Coco)
   4. Help: Interpreting unusual model fit results from generalized
      linear mixed model (glmmTMB/sjPlot) (Andre Syvertsen)

----------------------------------------------------------------------

Message: 1
Date: Thu, 25 Feb 2021 16:16:08 +0100
From: Yashree Mehta <yashree19 at gmail.com>
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] Residual Variance-Covariance matrix
Message-ID:
        <CAOE=hqKTTHxSChkLvxuO1USoCtk+ZmBbKtr6kDhxu=Xa438pZA at mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi,

I am using the following the code to extract the residual
variance-covariance matrix cov(Yi/Xi):

I first fit the model with the name lmemod_lme4. I have an unbalanced
cluster dataset.

Then, I extracted components with the following:

var.d <- crossprod(getME(lmemod_lme4,"Lambdat"))
Zt <- getME(lmemod_lme4,"Zt")
vr <- sigma(lmemod_lme4)^2

Then, I combine them with the following:

var.b <- t(Zt) %*% var.d %*% Zt
sI <- vr * Diagonal(nrow(Nameofdataset))
var.y <- var.b + sI

I have 2799 observations in my dataset. MY var.y matrix has dimension
2799 times 2799. Is there now a way to extract the

cov(Yi/Xi) for each observation? Also, I get a very large value for
the determinant of var.y.

I would be grateful to get further guidance on this.

Thank you very much.

Regards





------------------------------

Message: 2
Date: Thu, 25 Feb 2021 16:26:25 +0100
From: Phillip Alday <me at phillipalday.com>
To: Peter R Law <prldb at protonmail.com>,
        "r-sig-mixed-models at r-project.org" <r-sig-mixed-models at r-project.org>,
        Ben Bolker <bbolker at gmail.com>
Subject: Re: [R-sig-ME] lmer code for multiple random slopes
Message-ID: <f286a096-f420-c246-61ad-542eaa5dad44 at phillipalday.com>
Content-Type: text/plain; charset="utf-8"
On 25/2/21 5:16 am, Peter R Law via R-sig-mixed-models wrote:
This is to be expected: lme can't handle singular models (e.g. models
with an exactly zero variance component), but lmer can. So lme just gets
as close as it can.
It depends on what's going wrong. Generally these should return
identical fits for converged models (with the fine print that there are
certain models that you can specify in one but not the other without a
lot of effort).
The list strips most attachment types, so your script didn't make it
through. But just your data was enough for me to find a few problems

First, you're fitting and displaying two different models here:
Second, when I try to fit the model that's given by the summary here, I
get a numerical error in lme:
Error in lme.formula(Response ~ P + A, random = ~P + A | Group, data =
Trial,  :
  nlminb problem, convergence error code = 1
  message = false convergence (8)


This makes me think that something is wrong wrong in both bits of
software, but maybe the versions on your local machine are catching it
This screams that your normP and normA variables aren't being handled as
continuous but rather categorical/factors. This means that you have a
lot more slopes being estimated, which the model survives, but not the
extra correlation parameters (hence the success when you suppress them
with the two syntax variants you mention below).
Yep! Frequently, we have the reverse problem: people forget to keep the
list in CC. So thanks! The matching in the threading is handled mostly
by the subject line.
------------------------------

Message: 3
Date: Thu, 25 Feb 2021 10:36:28 -0700
From: Laura Coco <lauracoco at gmail.com>
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] question regarding time as a continuous factor in
        a linear mixed effects model
Message-ID:
        <CAKgKnP6Nx9Ee=vhPd6JtbUrzw4iyeXjh-=FZo9zCTo5RtcTjkQ at mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hello,

I am interested in investigating the main effects of group, time, and group
by time interaction on survey outcomes using linear mixed effects models.
Time is considered as continuous (number of days since baseline), but isn't
it also categorical, since I want to compare Session 1 vs Session 4 (for
example)? How is that handled in the model? As of now, time (days since
baseline) is being treated as one unit, rather than four separate sessions.

Here is an example of my code: mdl.outcome <- lmer(outcome ~ time*Group +
(1 | PID), data = dta)

Thank you!!





------------------------------

Message: 4
Date: Fri, 26 Feb 2021 10:20:18 +0000
From: Andre Syvertsen <Andre.Syvertsen at uib.no>
To: "r-sig-mixed-models at r-project.org"
        <r-sig-mixed-models at r-project.org>
Subject: [R-sig-ME] Help: Interpreting unusual model fit results from
        generalized linear mixed model (glmmTMB/sjPlot)
Message-ID:
        <AM6PR0102MB31748BFF6EAF4FEA67D042809E9D9 at AM6PR0102MB3174.eurprd01.prod.exchangelabs.com>

Content-Type: text/plain; charset="utf-8"

Hi,

I am working with a large dataset that contains longitudinal data on gambling behavior of 184,113 participants. The data is based on complete tracking of electronic gambling behavior within a gambling operator. Gambling behavior data is aggregated on a monthly level, a total of 70 months. I have an ID variable separating participants, a time variable (months), as well as numerous gambling behavior variables such as active days played for given month, bets placed for given month, total losses for given month, etc. I am investigating the role of age and gender in predicting active days gambling per month.

I have fitted a model with glmmTMB (see below for model code) and outputed the resulting statistics with sjPlot's tab_model function which I am having trouble interpreting. The full results can be found below. Notably, I appear to have gotten perfect intra-class correlation. While, I am sure variance in outcome responses (active days gambling) are likely to be heavily associated with subject and time, this seems excessive. Furthermore, the pseudo R2 suggests that 8.7% of the variance should be attributable to fixed effects which I would think would lower the variance attributable to individual/time? Are the results affected by the high number of observations, individuals and/or time points? Or maybe I have specified my model in an odd manner?

glmmTMB code for the model:

DaysPlayedConditionalAgeGenderTruncated <- glmmTMB(daysPlayed ~ 1 + time + ageCategory * gender + (time | id), dfLong, family = truncated_nbinom2)

Model summary:

Active Gambling Days Monthly
Predictors
Incidence Rate Ratios
CI
p
(Intercept)
1.18
1.16 ?C 1.20
<0.001
Time
0.99
0.99 ?C 0.99
<0.001
Age Category 30-39
1.29
1.26 ?C 1.32
<0.001
Age Category 40-49
1.81
1.78 ?C 1.85
<0.001
Age Category 50-59
2.47
2.41 ?C 2.53
<0.001
Age Category 60-69
3.08
2.99 ?C 3.17
<0.001
Age Category 70+
3.42
3.29 ?C 3.56
<0.001
Gender: Women
1.69
1.65 ?C 1.74
<0.001
Age Category 30-39:Women
0.90
0.86 ?C 0.94
<0.001
Age Category 40-49:Women
0.67
0.65 ?C 0.70
<0.001
Age Category 50-59:Women
0.53
0.50 ?C 0.55
<0.001
Age Category 60-69: Women
0.46
0.44 ?C 0.48
<0.001
Age Category 70+:Women
0.45
0.43 ?C 0.48
<0.001
Random Effects
??2
0.00
??00 id
1.63
??11 id.time
0.00
??01 id
-0.38
ICC
1.00
N id
184113
Observations
3231544
Marginal R2 / Conditional R2
0.087 / 1.000
Note. Intercept = Men, age 18-29 years, first time point (month 0 of 69).


Kind regards,
Andr??




------------------------------

Subject: Digest Footer

_______________________________________________
R-sig-mixed-models mailing list
R-sig-mixed-models at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models


------------------------------

End of R-sig-mixed-models Digest, Vol 170, Issue 20
***************************************************
#
Dear Andre,

the ICC is calculated by dividing the random effect variance by the total variance, i.e. the sum of the random effect variance and the residual variance. Usually, for longitudinal models, where you have a random slope and - due to the subject ID as group factor in the random effects - many groups, the proportion of the variance explained by this grouping structure is rather high. That means, for longitudinal designs, a high(er) ICC is often expected.

Additionally, in your case, the residual variance is almost 0 (see output), thus, the ICC is literally the random effect variance divided by the random effect variance - which is ~ 1.

Your question regarding the marginal R2 of 8.7% (i.e. ~ 8.7% explained variance of attributable to fixed effects) *could* probably due to computational issues. For random slope models, the ICC can be different for each level / value of the random slope. Thus, icc() reports a kind of "averaged" ICC (see ?icc and references), which might not exactly reflect the explained variance attributable to the random effects.

Best
Daniel

-----Urspr?ngliche Nachricht-----
Von: R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org> Im Auftrag von Andre Syvertsen
Gesendet: Freitag, 26. Februar 2021 12:35
An: r-sig-mixed-models at r-project.org
Betreff: Re: [R-sig-ME] Help: Interpreting unusual model fit results from generalized linear mixed model (glmmTMB/sjPlot) (Andre Syvertsen)

Text version of table:

  Active Gambling Days Monthly
Predictors Incidence Rate Ratios, CI,  p
(Intercept):   1.18, 1.16 ? 1.20,  <0.001
Time : 0.99, 0.99 ? 0.99,  <0.001
Age Category 30-39:  1.29, 1.26 ? 1.32, <0.001
Age Category 40-49:  1.81, 1.78 ? 1.85, <0.001
Age Category 50-59:  2.47, 2.41 ? 2.53, <0.001
Age Category 60-69:  3.08, 2.99 ? 3.17, <0.001
Age Category 70+:  3.42, 3.29 ? 3.56, <0.001
Gender-Women:   1.69, 1.65 ? 1.74, <0.001
Age Category 30-39-Women: 0.90, 0.86 ? 0.94, <0.001
Age Category 40-49-Women: 0.67, 0.65 ? 0.70, <0.001
Age Category 50-59-Women: 0.53, 0.50 ? 0.55, <0.001
Age Category 60-69- Women: 0.46, 0.44 ? 0.48, <0.001
Age Category 70+-Women:  0.45, 0.43 ? 0.48, <0.001

Random Effects
?2 0.00
?00 id 1.63
?11 id.time 0.00
?01 id -0.38
ICC 1.00
N id 184113

Observations 3,231,544
Marginal R2 / Conditional R2 0.087 / 1.000
Note: Intercept = Men, age 18-29 years, first time point (month 0 of 69).


________________________________
Fra: R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org> p? vegne av r-sig-mixed-models-request at r-project.org <r-sig-mixed-models-request at r-project.org>
Sendt: fredag 26. februar 2021 11:21
Til: r-sig-mixed-models at r-project.org <r-sig-mixed-models at r-project.org>
Emne: R-sig-mixed-models Digest, Vol 170, Issue 20

Send R-sig-mixed-models mailing list submissions to
        r-sig-mixed-models at r-project.org

To subscribe or unsubscribe via the World Wide Web, visit
        https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
or, via email, send a message with subject or body 'help' to
        r-sig-mixed-models-request at r-project.org

You can reach the person managing the list at
        r-sig-mixed-models-owner at r-project.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of R-sig-mixed-models digest..."


Today's Topics:

   1. Residual Variance-Covariance matrix (Yashree Mehta)
   2. Re: lmer code for multiple random slopes (Phillip Alday)
   3. question regarding time as a continuous factor in a linear
      mixed effects model (Laura Coco)
   4. Help: Interpreting unusual model fit results from generalized
      linear mixed model (glmmTMB/sjPlot) (Andre Syvertsen)

----------------------------------------------------------------------

Message: 1
Date: Thu, 25 Feb 2021 16:16:08 +0100
From: Yashree Mehta <yashree19 at gmail.com>
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] Residual Variance-Covariance matrix
Message-ID:
        <CAOE=hqKTTHxSChkLvxuO1USoCtk+ZmBbKtr6kDhxu=Xa438pZA at mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hi,

I am using the following the code to extract the residual
variance-covariance matrix cov(Yi/Xi):

I first fit the model with the name lmemod_lme4. I have an unbalanced
cluster dataset.

Then, I extracted components with the following:

var.d <- crossprod(getME(lmemod_lme4,"Lambdat"))
Zt <- getME(lmemod_lme4,"Zt")
vr <- sigma(lmemod_lme4)^2

Then, I combine them with the following:

var.b <- t(Zt) %*% var.d %*% Zt
sI <- vr * Diagonal(nrow(Nameofdataset))
var.y <- var.b + sI

I have 2799 observations in my dataset. MY var.y matrix has dimension
2799 times 2799. Is there now a way to extract the

cov(Yi/Xi) for each observation? Also, I get a very large value for
the determinant of var.y.

I would be grateful to get further guidance on this.

Thank you very much.

Regards





------------------------------

Message: 2
Date: Thu, 25 Feb 2021 16:26:25 +0100
From: Phillip Alday <me at phillipalday.com>
To: Peter R Law <prldb at protonmail.com>,
        "r-sig-mixed-models at r-project.org" <r-sig-mixed-models at r-project.org>,
        Ben Bolker <bbolker at gmail.com>
Subject: Re: [R-sig-ME] lmer code for multiple random slopes
Message-ID: <f286a096-f420-c246-61ad-542eaa5dad44 at phillipalday.com>
Content-Type: text/plain; charset="utf-8"
On 25/2/21 5:16 am, Peter R Law via R-sig-mixed-models wrote:
This is to be expected: lme can't handle singular models (e.g. models
with an exactly zero variance component), but lmer can. So lme just gets
as close as it can.
It depends on what's going wrong. Generally these should return
identical fits for converged models (with the fine print that there are
certain models that you can specify in one but not the other without a
lot of effort).
The list strips most attachment types, so your script didn't make it
through. But just your data was enough for me to find a few problems

First, you're fitting and displaying two different models here:
Second, when I try to fit the model that's given by the summary here, I
get a numerical error in lme:
Error in lme.formula(Response ~ P + A, random = ~P + A | Group, data =
Trial,  :
  nlminb problem, convergence error code = 1
  message = false convergence (8)


This makes me think that something is wrong wrong in both bits of
software, but maybe the versions on your local machine are catching it
This screams that your normP and normA variables aren't being handled as
continuous but rather categorical/factors. This means that you have a
lot more slopes being estimated, which the model survives, but not the
extra correlation parameters (hence the success when you suppress them
with the two syntax variants you mention below).
Yep! Frequently, we have the reverse problem: people forget to keep the
list in CC. So thanks! The matching in the threading is handled mostly
by the subject line.
------------------------------

Message: 3
Date: Thu, 25 Feb 2021 10:36:28 -0700
From: Laura Coco <lauracoco at gmail.com>
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] question regarding time as a continuous factor in
        a linear mixed effects model
Message-ID:
        <CAKgKnP6Nx9Ee=vhPd6JtbUrzw4iyeXjh-=FZo9zCTo5RtcTjkQ at mail.gmail.com>
Content-Type: text/plain; charset="utf-8"

Hello,

I am interested in investigating the main effects of group, time, and group
by time interaction on survey outcomes using linear mixed effects models.
Time is considered as continuous (number of days since baseline), but isn't
it also categorical, since I want to compare Session 1 vs Session 4 (for
example)? How is that handled in the model? As of now, time (days since
baseline) is being treated as one unit, rather than four separate sessions.

Here is an example of my code: mdl.outcome <- lmer(outcome ~ time*Group +
(1 | PID), data = dta)

Thank you!!





------------------------------

Message: 4
Date: Fri, 26 Feb 2021 10:20:18 +0000
From: Andre Syvertsen <Andre.Syvertsen at uib.no>
To: "r-sig-mixed-models at r-project.org"
        <r-sig-mixed-models at r-project.org>
Subject: [R-sig-ME] Help: Interpreting unusual model fit results from
        generalized linear mixed model (glmmTMB/sjPlot)
Message-ID:
        <AM6PR0102MB31748BFF6EAF4FEA67D042809E9D9 at AM6PR0102MB3174.eurprd01.prod.exchangelabs.com>

Content-Type: text/plain; charset="utf-8"

Hi,

I am working with a large dataset that contains longitudinal data on gambling behavior of 184,113 participants. The data is based on complete tracking of electronic gambling behavior within a gambling operator. Gambling behavior data is aggregated on a monthly level, a total of 70 months. I have an ID variable separating participants, a time variable (months), as well as numerous gambling behavior variables such as active days played for given month, bets placed for given month, total losses for given month, etc. I am investigating the role of age and gender in predicting active days gambling per month.

I have fitted a model with glmmTMB (see below for model code) and outputed the resulting statistics with sjPlot's tab_model function which I am having trouble interpreting. The full results can be found below. Notably, I appear to have gotten perfect intra-class correlation. While, I am sure variance in outcome responses (active days gambling) are likely to be heavily associated with subject and time, this seems excessive. Furthermore, the pseudo R2 suggests that 8.7% of the variance should be attributable to fixed effects which I would think would lower the variance attributable to individual/time? Are the results affected by the high number of observations, individuals and/or time points? Or maybe I have specified my model in an odd manner?

glmmTMB code for the model:

DaysPlayedConditionalAgeGenderTruncated <- glmmTMB(daysPlayed ~ 1 + time + ageCategory * gender + (time | id), dfLong, family = truncated_nbinom2)

Model summary:

Active Gambling Days Monthly
Predictors
Incidence Rate Ratios
CI
p
(Intercept)
1.18
1.16 ?C 1.20
<0.001
Time
0.99
0.99 ?C 0.99
<0.001
Age Category 30-39
1.29
1.26 ?C 1.32
<0.001
Age Category 40-49
1.81
1.78 ?C 1.85
<0.001
Age Category 50-59
2.47
2.41 ?C 2.53
<0.001
Age Category 60-69
3.08
2.99 ?C 3.17
<0.001
Age Category 70+
3.42
3.29 ?C 3.56
<0.001
Gender: Women
1.69
1.65 ?C 1.74
<0.001
Age Category 30-39:Women
0.90
0.86 ?C 0.94
<0.001
Age Category 40-49:Women
0.67
0.65 ?C 0.70
<0.001
Age Category 50-59:Women
0.53
0.50 ?C 0.55
<0.001
Age Category 60-69: Women
0.46
0.44 ?C 0.48
<0.001
Age Category 70+:Women
0.45
0.43 ?C 0.48
<0.001
Random Effects
??2
0.00
??00 id
1.63
??11 id.time
0.00
??01 id
-0.38
ICC
1.00
N id
184113
Observations
3231544
Marginal R2 / Conditional R2
0.087 / 1.000
Note. Intercept = Men, age 18-29 years, first time point (month 0 of 69).


Kind regards,
Andr??




------------------------------

Subject: Digest Footer

_______________________________________________
R-sig-mixed-models mailing list
R-sig-mixed-models at r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models


------------------------------

End of R-sig-mixed-models Digest, Vol 170, Issue 20
***************************************************


_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

--

_____________________________________________________________________

Universit?tsklinikum Hamburg-Eppendorf; K?rperschaft des ?ffentlichen Rechts; Gerichtsstand: Hamburg | www.uke.de
Vorstandsmitglieder: Prof. Dr. Burkhard G?ke (Vorsitzender), Joachim Pr?l?, Prof. Dr. Blanche Schwappach-Pignataro, Marya Verdel
_____________________________________________________________________

SAVE PAPER - THINK BEFORE PRINTING