Dear members of the mixed-models list,
I'm writing to ask a question about hierarchical and zero-inflated (ZI) counts measured over time.
To get preliminary results, I'm modeling longitudinal data with a negative binomial GLMM, via the
lme4 and glmmADMB packages (very similar results). I have considered two possibilities:
1) A single random intercept:
glmer.0.int.NB <- glmer.nb(counts ~ obstime + (1|id), data = tr.j) # lme4 package
tr.j$ID <- as.factor(tr.j$id)
glmmADMB.0.int.NB <- glmmadmb(claimyr ~ obstime + (1|ID), data = tr.j, family = "nbinom")
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -0.9652     0.1222 -7.9005   0.0000
obstime       0.0238     0.0073  3.2735   0.0011
2) Random intercept and random slope effects:
glmer.0.slp.NB <- glmer.nb(counts ~ obstime + (obstime|id), data = tr.j) # lme4 package
glmmADMB.0.slp.NB <- glmmadmb(claimyr ~ obstime + (obstime|ID), data = tr.j, family = "nbinom")
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  -0.9401     0.1190 -7.9005   0.0000
obstime       0.0230     0.0075  3.0540   0.0023
Surprisingly, the anova test indicates no significant improvement from fitting the second model:
anova(glmer.0.int.NB, glmer.0.slp.NB) # LRT: p-value = 0.2725 > 0.05
anova(glmmADMB.0.int.NB, glmmADMB.0.slp.NB) # LRT: p-value = 0.1042 > 0.05
As far as I know, when dealing with repeated measurements over time, we expect outcomes closer in time to be
more correlated (it is indeed a more realistic approach), so I'm quite puzzled by this result.
Can anyone explain what the reason could be?
Best,
Xavier
Random slope does not improve hierarchical fitting across time
4 messages · xavier piulachs, Ben Bolker, Phillip Alday
On 16-07-29 03:09 PM, xavier piulachs wrote:
[...]
A few comments:
- most important: just because an effect is 'really' in the model (e.g.,
in this case, the effect of time really does vary among individuals)
doesn't mean it will have a statistically significant effect. In most
observational/complex fields (population biology, social sciences),
*all* of the effects are really non-zero. The purpose of significance
tests is to see which effects can be distinguished from noise.
- your explanation ("outcomes closer in time are more correlated") isn't
a very precise description of what the (obstime|ID) term in the model is
doing. Your description is of an autoregressive model; the (obstime|ID)
model is a random-slope model (slopes with respect to time vary among
individuals). You might want to check out the glmmTMB package for
autoregressive models ...
- glmmADMB's default correlation structure is diagonal, glmer.nb's is
unstructured; if you use (obstime||ID) in glmer.nb or corStruct="full"
in glmmadmb you should get more similar results (I would generally
recommend "full" as the default ...)
- likelihood ratio tests (which is what anova() is doing) generally give
conservative p-values when applied to random-effect variances (boundary
issues -- see http://tinyurl.com/glmmFAQ.html or Bolker (2009) or
Pinheiro and Bates 2000 for more discussion) -- so the p-values should
probably be approximately halved
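
As a minimal sketch of the last point, the halving rule of thumb can be applied to the two p-values reported in the original post. This uses base R only; the variable names are illustrative, and halving is only an approximate correction (it is most accurate when a single variance parameter is being tested).

```r
# Naive LRT p-values as reported from the two anova() calls in the original post.
p_naive <- c(glmer = 0.2725, glmmADMB = 0.1042)

# Rule of thumb: halve the naive p-value, because the null value of a
# variance (zero) lies on the boundary of its allowed range.
p_halved <- p_naive / 2
print(p_halved)

# Compare the adjusted values with the conventional 0.05 threshold.
print(p_halved < 0.05)
```

Even after halving, both values stay above 0.05, so the correction does not change the conclusion for these particular fits.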
Dear Ben,

First of all, many thanks for your quick response. I'm aware that you are an expert in this field, so I'm doubly happy to receive your comments. I have two doubts about what you say (the key point is probably the first):

1) The effect of time is in the model as a fixed effect (and it is significant), ok. But I would also expect each subject, i = 1,...,n, to have:
a) An underlying baseline level (i.e., a subject-specific baseline effect = beta0 + random intercept = beta0 + ui0), and
b) A particular trend over time (a subject-specific slope = fixed effect of time + random slope = beta_t + uit).
It is indeed very common, when dealing with repeated measurements over time (a particular case of longitudinal models), to have these two significant effects. In fact, whenever I have fitted longitudinal measurements over time (with an unstructured correlation matrix by default), I have found that the random intercept and slope model improves on the model with a single random intercept. So I think this is compatible with the idea of an autoregressive model. Is that correct?

2) I have fitted the GLMM with the option corStruct = "full":
glmmADMB.0.int.NB <- glmmadmb(claimyr ~ obstime + (1|ID), corStruct = "full", data = tr.j, family = "nbinom")
And I get the following R error message:
Parameters were estimated, but standard errors were not: the most likely problem is that the curvature at MLE was zero or negative
The function maximizer failed (couldn't find parameter file)

Best,
Xavier
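
To make the difference between a random-slope structure and an autoregressive one concrete, here is a minimal sketch in base R. It compares the correlation matrices the two structures imply on the linear-predictor scale. All numeric values (times, standard deviations, correlations) are hypothetical and chosen only for illustration; they are not estimates from tr.j.

```r
# Hypothetical values for illustration only (not estimated from tr.j).
times  <- 0:4    # observation times
sd_int <- 1.0    # SD of the random intercepts
sd_slp <- 0.3    # SD of the random slopes
rho_is <- 0      # intercept-slope correlation
rho_ar <- 0.6    # AR(1) correlation between adjacent time points

# Random intercept + slope: Cov(b0 + b1*s, b0 + b1*t) = Z G Z'
Z <- cbind(1, times)
G <- matrix(c(sd_int^2,                 rho_is * sd_int * sd_slp,
              rho_is * sd_int * sd_slp, sd_slp^2), nrow = 2)
cov_slope <- Z %*% G %*% t(Z)
cor_slope <- cov2cor(cov_slope)

# AR(1): the correlation depends only on the lag |s - t|.
cor_ar1 <- rho_ar^abs(outer(times, times, "-"))

print(round(cor_slope, 2))
print(round(cor_ar1, 2))
```

Under AR(1), every pair of adjacent time points has the same correlation and the correlation decays with the lag. Under the random-slope structure, the correlation between two observations depends on the actual times, not only on how far apart they are, and the variance grows with time. The two structures therefore describe different things.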
From: R-sig-mixed-models <r-sig-mixed-models-bounces at r-project.org> on behalf of Ben Bolker <bbolker at gmail.com>
Sent: Friday, 29 July 2016 19:48
To: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] Random slope does not improve hierarchical fitting across time
[...]
_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
Dear Xavier,

Both of your issues point towards insufficient data (power too low) or a model specification not suited to the data. See Bates et al., Parsimonious Mixed Models (http://arxiv.org/abs/1506.04967), for a discussion of some of these issues.

Bottom line: you can't estimate all of your parameters precisely enough (large standard errors or inability to estimate them) with the data you have (either because there's not enough or because it isn't well described by your model specification), and so you fail to achieve significance.

Best,
Phillip
On Fri, 2016-07-29 at 20:32 +0000, xavier piulachs wrote:
[...]