confidence intervals with mvrnorm - upper value equal to inf

Thu, Oct 15, 2020 1:38 PM

Thanks for the clarifications.

So if I understand right, model diagnostics can tell me if my model
parameters are multivariate normal. I think they are, see the dharma
diagnostic plot attached.

The values you requested:
 > X.cond[1,]
  (Intercept) Used_piNgersN
            1             0

(Intercept) Used_piNgersN
    -19.77343      19.26693

(Intercept) Used_piNgersN
(Intercept)     168344705    -168344704
Used_piNgersN  -168344704     168344705

I noted that STd.Error values in the conditional model are very high
as well (12974.8).
I run the same model but set family as nbinom2

m1 <- glmmTMB(Dolphins.TOT ~ Used_piNgers + offset(log(Effort)) +
(1|tripID), ziformula = ~1,   data=x, family = "nbinom2")

and the Inf value do not appear. Was it an overdispersion problem? Can
I use the negative binomial model instead?

Sorry, I have tried to simulate my data to provide you with more
information but I am not sure I am doing it correctly. The effort
varies in each trial and I believe it can affect the value of
Dolphins.TOT.

On Wed, Oct 14, 2020 at 7:24 PM Ben Bolker <bbolker at gmail.com> wrote:

   A few clarifying points below.

On 10/14/20 3:11 PM, Alessandra Bielli wrote:

Hi Ben

That's a good point, my parameters are not multivariate normal.
But then, why was this method used in the Salamander example, where
data are zero inflated and so, I assume, not multivariate normal?

   The multivariate normality refers to the sampling distribution of the
*model parameters*, not to the distribution (marginal or conditional of
the response variable).  In general the sampling distribution of the
model parameters is multivariate normal if the data are "well enough
behaved" (i.e., either the responses themselves are normal *or* there is
enough data, and enough information in the data, for the log-likelihood
to become quadratic in a sufficiently large neighborhood of the MLE ...)

I haven't simulated the data to share yet, but the "Inf" first appears
at the line >pred.ucount.psim =
exp(pred.cond.psim)*(1-plogis(pred.zi.psim))
Only the first row, Used_piNgers=Y, shows "Inf".

    Hmm.  That's a little surprising, it implies that pred.cond.psim is
a large number (>700 or so, which for a log-scale parameter is huge).
Can you show us X.cond[1,] , beta.cond, vcov(m1)$cond ?

Thanks,
Alessandra

On Sun, Oct 11, 2020 at 3:46 PM Ben Bolker <bbolker at gmail.com> wrote:

    It's hard to troubleshoot this without a reproducible example. Unless
the answer is obvious -- which it's not, to me -- the easiest way to
troubleshoot is to work through the steps one at a time and see where
the infinite values first appear.  Can you create such an example either
by posting your data or by simulating data that looks like your data?

    The posterior predictive simulation approach assumes that the
sampling distributions of the parameters are multivariate normal, which
is likely to be questionable in a low-information setting (which will be
the case if you don't have very many non-zero values ...)


On 10/10/20 7:28 AM, Alessandra Bielli wrote:

Dear list,

I am trying to predict a value and CI for two different treatments from a
glmmTMB fitted model using posterior predictive simulations (mvrnorm
function in the MASS package) as in the Salamander example, Brooks 2017
appendix B
<https://www.biorxiv.org/content/biorxiv/suppl/2017/05/01/132753.DC1/132753-2.pdf>.
The dependent variable is a count, majority of values are zeros but some
positive values appear.

m1 <- glmmTMB(Dolphins.TOT ~ Used_piNgers + offset(log(Effort)) +
(1|Trip_Code), ziformula =~1,
                data=x, family = "truncated_poisson")

      newdata0 = with(x,
                  expand.grid(
                    Used_piNgers = c("Y","N"),
                    Effort=1))

X.cond = model.matrix(lme4::nobars(formula(m1)[-2]), newdata0)

beta.cond = fixef(m1)$cond
pred.cond = X.cond %*% beta.cond
ziformula = m1$modelInfo$allForm$ziformula
X.zi = model.matrix(lme4::nobars(ziformula), newdata0)
beta.zi = fixef(m1)$zi
pred.zi = X.zi %*% beta.zi

pred.ucount = exp(pred.cond)*(1-plogis(pred.zi))


set.seed(101)
pred.condpar.psim = mvrnorm(1000,mu=beta.cond,Sigma=vcov(m1)$cond)
pred.cond.psim = X.cond %*% t(pred.condpar.psim)
pred.zipar.psim = mvrnorm(1000,mu=beta.zi,Sigma=vcov(m1)$zi)
pred.zi.psim = X.zi %*% t(pred.zipar.psim)
pred.ucount.psim = exp(pred.cond.psim)*(1-plogis(pred.zi.psim))
ci.ucount = t(apply(pred.ucount.psim,1,quantile,c(0.025,0.975)))
ci.ucount = data.frame(ci.ucount)
names(ci.ucount) = c("ucount.low","ucount.high")
pred.ucount = data.frame(newdata0, pred.ucount, ci.ucount)

For my upper CI, I get a value equal to Inf:

Used_piNgers Effort  pred.ucount ucount.low ucount.high
1            Y      1 6.758889e-11 0.00000000         Inf
2            N      1 1.575418e-02 0.00223033   0.1096139

Is the Inf caused by the very low variability of values in my dataset? I
tried to lower the upper bound of the CI ci.ucount =
t(apply(pred.ucount.psim,1,quantile,c(0.025,0.975))) and only when reaching
0.475 ci.ucount = t(apply(pred.ucount.psim,1,quantile,c(0.025,0.475))) I
obtained:

Used_piNgers Effort  pred.ucount ucount.low  ucount.high
1            Y      1 6.758889e-11 0.00000000 7.117465e+12
2            N      1 1.575418e-02 0.00223033 1.454579e-02

I found a related post
<https://stackoverflow.com/questions/38272798/bootstrap-confidence-interval-with-inf-in-final-estimates-boot-dplyr-package>
but
the explanation is not clear to me. I would like to publish these results
and I would like to know:

     1. is this a sign that something is wrong? if yes, what is it?
     2. if nothing is wrong, what does the Inf mean and what's the best way
     to report it and plot it in a publication?


I also posted this question on Cross validated
https://stats.stackexchange.com/questions/491196/bootstrap-confidence-interval-with-mvrnorm-upper-value-equal-to-inf

Thanks,
Alessandra

       [[alternative HTML version deleted]]

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

confidence intervals with mvrnorm - upper value equal to inf

Thread (8 messages)