model check for negative binomial model - R-SIG-mixed-models

Tue, Feb 11, 2020 5:09 PM #

Dear list

I am fitting a poisson model to estimate the effect of a treatment on
emergence success of hatchlings. To estimate emergence success, I use
number of emerged and an offset(log(total clutch).

However, overdispersion was detected:

chisq       ratio         rdf           p
3490.300836    5.684529  614.000000    0.000000

Therefore, I switched to a negative binomial. I know overdispersion is not
relevant for nb models, but the model plots don't look too good. I also
tried to fit a poisson model with OLRE, but still the  plots don't look
good.
How do I know if my model is good enough, and what can I do to improve it?

Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) ['glmerMod']
 Family: Negative Binomial(7.604)  ( log )
Formula: Hatched ~ Relocation..Y.N. + SP + offset(log(Total_Clutch)) + (1
|Beach_ID) + (1 | Year)
   Data: main

     AIC      BIC   logLik deviance df.resid
  6015.6   6042.2  -3001.8   6003.6      614

Scaled residuals:
    Min      1Q  Median      3Q     Max
-2.6427 -0.3790  0.1790  0.5242  1.6583

Random effects:
 Groups   Name        Variance Std.Dev.
 Beach_ID (Intercept) 0.004438 0.06662
 Year     (Intercept) 0.001640 0.04050
Number of obs: 620, groups:  Beach_ID, 8; Year, 5

Fixed effects:
                  Estimate Std. Error z value Pr(>|z|)
(Intercept)       -0.29915    0.04055  -7.377 1.62e-13 ***
Relocation..Y.N.Y -0.16402    0.05052  -3.247  0.00117 **
SPL               -0.08311    0.04365  -1.904  0.05689 .
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Correlation of Fixed Effects:
            (Intr) R..Y.N
Rlct..Y.N.Y -0.114
SPL         -0.497 -0.054


Thanks for your help,

Alessandra

-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 65277 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20200211/4946bfa0/attachment-0001.png>

Ben Bolker

Tue, Feb 11, 2020 5:29 PM #

Short answer: if emergence success gets much above 50%, then the
approximation you're making (Poisson + offset for binomial, or NB +
offset for negative binomial) doesn't work well.  You might try a
beta-binomial (with glmmTMB) or a binomial + an observation-level random
effect.

  (On the other hand, your intercept is -0.3, which corresponds to a
baseline emergence of 0.42 - not *very* high (but some beaches and years
will be well above that ...)

  Beyond that, are there any obvious patterns of mis-fit in the
predicted values ... ?

On 2020-02-11 8:09 p.m., Alessandra Bielli wrote:

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Alessandra Bielli

Wed, Feb 12, 2020 9:42 AM #

Dear Ben

Thanks for your quick response.

Yes, emergence success is usually between 60 and 80% or higher.
I am not sure how to use a binomial, if my data are counts?

Can you explain why the approximation doesn't work well if success gets
much above 50%? Does it make sense, then, to have "unhatched" as dependent
variable, so that I predict mortality (usually below 50%) using a nb with
offset(log(total clutch)) ?

Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) ['glmerMod']
 Family: Negative Binomial(2.2104)  ( log )
Formula: Unhatched ~ Relocation..Y.N. + SP + offset(log(Total_Clutch)) +
   (1 | Beach_ID) + (1 | Week)
   Data: main

     AIC      BIC   logLik deviance df.resid
  5439.4   5466.0  -2713.7   5427.4      614

Scaled residuals:
    Min      1Q  Median      3Q     Max
-1.4383 -0.7242 -0.2287  0.4866  4.0531

Random effects:
 Groups   Name        Variance Std.Dev.
 Week     (Intercept) 0.003092 0.0556
 Beach_ID (Intercept) 0.025894 0.1609
Number of obs: 620, groups:  Week, 31; Beach_ID, 8

Fixed effects:
                  Estimate Std. Error z value Pr(>|z|)
(Intercept)       -1.38864    0.08227 -16.879  < 2e-16 ***
Relocation..Y.N.Y  0.32105    0.09152   3.508 0.000452 ***
SPL                0.22218    0.08793   2.527 0.011508 *
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Correlation of Fixed Effects:
            (Intr) R..Y.N
Rlct..Y.N.Y -0.143
SPL         -0.540 -0.038

Thanks,

Alessandra

On Tue, Feb 11, 2020 at 7:29 PM Ben Bolker <bbolker at gmail.com> wrote:

  Short answer: if emergence success gets much above 50%, then the
approximation you're making (Poisson + offset for binomial, or NB +
offset for negative binomial) doesn't work well.  You might try a
beta-binomial (with glmmTMB) or a binomial + an observation-level random
effect.

  (On the other hand, your intercept is -0.3, which corresponds to a
baseline emergence of 0.42 - not *very* high (but some beaches and years
will be well above that ...)

  Beyond that, are there any obvious patterns of mis-fit in the
predicted values ... ?

On 2020-02-11 8:09 p.m., Alessandra Bielli wrote:

Dear list

I am fitting a poisson model to estimate the effect of a treatment on
emergence success of hatchlings. To estimate emergence success, I use
number of emerged and an offset(log(total clutch).

However, overdispersion was detected:

overdisp_fun(m.emerged) #overdispersion detected

      chisq       ratio         rdf           p
3490.300836    5.684529  614.000000    0.000000

Therefore, I switched to a negative binomial. I know overdispersion is

not

relevant for nb models, but the model plots don't look too good. I also
tried to fit a poisson model with OLRE, but still the  plots don't look
good.
How do I know if my model is good enough, and what can I do to improve

it?

summary(m.emerged)

Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) ['glmerMod']
 Family: Negative Binomial(7.604)  ( log )
Formula: Hatched ~ Relocation..Y.N. + SP + offset(log(Total_Clutch)) + (1
|Beach_ID) + (1 | Year)
   Data: main

     AIC      BIC   logLik deviance df.resid
  6015.6   6042.2  -3001.8   6003.6      614

Scaled residuals:
    Min      1Q  Median      3Q     Max
-2.6427 -0.3790  0.1790  0.5242  1.6583

Random effects:
 Groups   Name        Variance Std.Dev.
 Beach_ID (Intercept) 0.004438 0.06662
 Year     (Intercept) 0.001640 0.04050
Number of obs: 620, groups:  Beach_ID, 8; Year, 5

Fixed effects:
                  Estimate Std. Error z value Pr(>|z|)
(Intercept)       -0.29915    0.04055  -7.377 1.62e-13 ***
Relocation..Y.N.Y -0.16402    0.05052  -3.247  0.00117 **
SPL               -0.08311    0.04365  -1.904  0.05689 .
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Correlation of Fixed Effects:
            (Intr) R..Y.N
Rlct..Y.N.Y -0.114
SPL         -0.497 -0.054


Thanks for your help,

Alessandra

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Rplot.pdf
Type: application/pdf
Size: 58626 bytes
Desc: not available
URL: <https://stat.ethz.ch/pipermail/r-sig-mixed-models/attachments/20200212/ccb30a6c/attachment-0001.pdf>

Thierry Onkelinx

Mon, Feb 17, 2020 5:18 AM #

Dear Alessandra,

Since you have both the number hatched and the total clutch size you can
calculate the number of successes and failures. That is sufficient for a
binomial distribution.

glmer(cbind(Hatched, Unhatched) ~ Relocation..Y.N. + SP + (1 | Beach_ID) +
(1 | Week), family = binmial)

A negative binomial or Poisson allow predictions larger than the offset.
Which is nonsense given that the number hatched cannot surpass the total
clutch size.

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND
FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkelinx at inbo.be
Havenlaan 88 bus 73, 1000 Brussel
www.inbo.be

///////////////////////////////////////////////////////////////////////////////////////////
To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey
///////////////////////////////////////////////////////////////////////////////////////////

<https://www.inbo.be>

Op wo 12 feb. 2020 om 18:42 schreef Alessandra Bielli <
bielli.alessandra at gmail.com>:

Dear Ben

Thanks for your quick response.

Yes, emergence success is usually between 60 and 80% or higher.
I am not sure how to use a binomial, if my data are counts?

Can you explain why the approximation doesn't work well if success gets
much above 50%? Does it make sense, then, to have "unhatched" as dependent
variable, so that I predict mortality (usually below 50%) using a nb with
offset(log(total clutch)) ?

summary(m.emerged)

Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) ['glmerMod']
 Family: Negative Binomial(2.2104)  ( log )
Formula: Unhatched ~ Relocation..Y.N. + SP + offset(log(Total_Clutch)) +
   (1 | Beach_ID) + (1 | Week)
   Data: main

     AIC      BIC   logLik deviance df.resid
  5439.4   5466.0  -2713.7   5427.4      614

Scaled residuals:
    Min      1Q  Median      3Q     Max
-1.4383 -0.7242 -0.2287  0.4866  4.0531

Random effects:
 Groups   Name        Variance Std.Dev.
 Week     (Intercept) 0.003092 0.0556
 Beach_ID (Intercept) 0.025894 0.1609
Number of obs: 620, groups:  Week, 31; Beach_ID, 8

Fixed effects:
                  Estimate Std. Error z value Pr(>|z|)
(Intercept)       -1.38864    0.08227 -16.879  < 2e-16 ***
Relocation..Y.N.Y  0.32105    0.09152   3.508 0.000452 ***
SPL                0.22218    0.08793   2.527 0.011508 *
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Correlation of Fixed Effects:
            (Intr) R..Y.N
Rlct..Y.N.Y -0.143
SPL         -0.540 -0.038

Thanks,

Alessandra

On Tue, Feb 11, 2020 at 7:29 PM Ben Bolker <bbolker at gmail.com> wrote:

  Short answer: if emergence success gets much above 50%, then the
approximation you're making (Poisson + offset for binomial, or NB +
offset for negative binomial) doesn't work well.  You might try a
beta-binomial (with glmmTMB) or a binomial + an observation-level random
effect.

  (On the other hand, your intercept is -0.3, which corresponds to a
baseline emergence of 0.42 - not *very* high (but some beaches and years
will be well above that ...)

  Beyond that, are there any obvious patterns of mis-fit in the
predicted values ... ?

On 2020-02-11 8:09 p.m., Alessandra Bielli wrote:

Dear list

I am fitting a poisson model to estimate the effect of a treatment on
emergence success of hatchlings. To estimate emergence success, I use
number of emerged and an offset(log(total clutch).

However, overdispersion was detected:

overdisp_fun(m.emerged) #overdispersion detected

      chisq       ratio         rdf           p
3490.300836    5.684529  614.000000    0.000000

Therefore, I switched to a negative binomial. I know overdispersion is

not

relevant for nb models, but the model plots don't look too good. I also
tried to fit a poisson model with OLRE, but still the  plots don't look
good.
How do I know if my model is good enough, and what can I do to improve

it?

summary(m.emerged)

Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) ['glmerMod']
 Family: Negative Binomial(7.604)  ( log )
Formula: Hatched ~ Relocation..Y.N. + SP + offset(log(Total_Clutch)) +

(1

|Beach_ID) + (1 | Year)
   Data: main

     AIC      BIC   logLik deviance df.resid
  6015.6   6042.2  -3001.8   6003.6      614

Scaled residuals:
    Min      1Q  Median      3Q     Max
-2.6427 -0.3790  0.1790  0.5242  1.6583

Random effects:
 Groups   Name        Variance Std.Dev.
 Beach_ID (Intercept) 0.004438 0.06662
 Year     (Intercept) 0.001640 0.04050
Number of obs: 620, groups:  Beach_ID, 8; Year, 5

Fixed effects:
                  Estimate Std. Error z value Pr(>|z|)
(Intercept)       -0.29915    0.04055  -7.377 1.62e-13 ***
Relocation..Y.N.Y -0.16402    0.05052  -3.247  0.00117 **
SPL               -0.08311    0.04365  -1.904  0.05689 .
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Correlation of Fixed Effects:
            (Intr) R..Y.N
Rlct..Y.N.Y -0.114
SPL         -0.497 -0.054


Thanks for your help,

Alessandra

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Alessandra Bielli

Mon, Feb 17, 2020 10:48 AM #

Dear Thierry

Thanks for your reply.

I read a bit about the prediction for a binomial model with
success/failures and I have a couple of questions.

If I use the predict function with the model you recommended, I obtain
log.odds or probabilities if I use "type=response":

tapply(predict(m.unhatched,type="response"),list(main$SP,main$Relocation..Y.N.),mean)
          N         Y
G 0.7314196 0.6414554
L 0.6983576 0.6003087

Are these probabilities of success (i.e. hatched) in one nest?

Thanks,

Alessandra

On Mon, Feb 17, 2020 at 7:18 AM Thierry Onkelinx <thierry.onkelinx at inbo.be>
wrote:

Dear Alessandra,

Since you have both the number hatched and the total clutch size you can
calculate the number of successes and failures. That is sufficient for a
binomial distribution.

glmer(cbind(Hatched, Unhatched) ~ Relocation..Y.N. + SP + (1 | Beach_ID)
+ (1 | Week), family = binmial)

A negative binomial or Poisson allow predictions larger than the offset.
Which is nonsense given that the number hatched cannot surpass the total
clutch size.

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND
FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkelinx at inbo.be
Havenlaan 88 bus 73, 1000 Brussel
www.inbo.be

///////////////////////////////////////////////////////////////////////////////////////////
To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

///////////////////////////////////////////////////////////////////////////////////////////

<https://www.inbo.be>

Op wo 12 feb. 2020 om 18:42 schreef Alessandra Bielli <
bielli.alessandra at gmail.com>:

Dear Ben

Thanks for your quick response.

Yes, emergence success is usually between 60 and 80% or higher.
I am not sure how to use a binomial, if my data are counts?

Can you explain why the approximation doesn't work well if success gets
much above 50%? Does it make sense, then, to have "unhatched" as dependent
variable, so that I predict mortality (usually below 50%) using a nb with
offset(log(total clutch)) ?

summary(m.emerged)

Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) ['glmerMod']
 Family: Negative Binomial(2.2104)  ( log )
Formula: Unhatched ~ Relocation..Y.N. + SP + offset(log(Total_Clutch)) +
   (1 | Beach_ID) + (1 | Week)
   Data: main

     AIC      BIC   logLik deviance df.resid
  5439.4   5466.0  -2713.7   5427.4      614

Scaled residuals:
    Min      1Q  Median      3Q     Max
-1.4383 -0.7242 -0.2287  0.4866  4.0531

Random effects:
 Groups   Name        Variance Std.Dev.
 Week     (Intercept) 0.003092 0.0556
 Beach_ID (Intercept) 0.025894 0.1609
Number of obs: 620, groups:  Week, 31; Beach_ID, 8

Fixed effects:
                  Estimate Std. Error z value Pr(>|z|)
(Intercept)       -1.38864    0.08227 -16.879  < 2e-16 ***
Relocation..Y.N.Y  0.32105    0.09152   3.508 0.000452 ***
SPL                0.22218    0.08793   2.527 0.011508 *
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Correlation of Fixed Effects:
            (Intr) R..Y.N
Rlct..Y.N.Y -0.143
SPL         -0.540 -0.038

Thanks,

Alessandra

On Tue, Feb 11, 2020 at 7:29 PM Ben Bolker <bbolker at gmail.com> wrote:

  Short answer: if emergence success gets much above 50%, then the
approximation you're making (Poisson + offset for binomial, or NB +
offset for negative binomial) doesn't work well.  You might try a
beta-binomial (with glmmTMB) or a binomial + an observation-level random
effect.

  (On the other hand, your intercept is -0.3, which corresponds to a
baseline emergence of 0.42 - not *very* high (but some beaches and years
will be well above that ...)

  Beyond that, are there any obvious patterns of mis-fit in the
predicted values ... ?

On 2020-02-11 8:09 p.m., Alessandra Bielli wrote:

Dear list

I am fitting a poisson model to estimate the effect of a treatment on
emergence success of hatchlings. To estimate emergence success, I use
number of emerged and an offset(log(total clutch).

However, overdispersion was detected:

overdisp_fun(m.emerged) #overdispersion detected

      chisq       ratio         rdf           p
3490.300836    5.684529  614.000000    0.000000

Therefore, I switched to a negative binomial. I know overdispersion is

not

relevant for nb models, but the model plots don't look too good. I

also

tried to fit a poisson model with OLRE, but still the  plots don't

look

good.
How do I know if my model is good enough, and what can I do to improve

it?

summary(m.emerged)

Generalized linear mixed model fit by maximum likelihood (Laplace
Approximation) ['glmerMod']
 Family: Negative Binomial(7.604)  ( log )
Formula: Hatched ~ Relocation..Y.N. + SP + offset(log(Total_Clutch))

+ (1

|Beach_ID) + (1 | Year)
   Data: main

     AIC      BIC   logLik deviance df.resid
  6015.6   6042.2  -3001.8   6003.6      614

Scaled residuals:
    Min      1Q  Median      3Q     Max
-2.6427 -0.3790  0.1790  0.5242  1.6583

Random effects:
 Groups   Name        Variance Std.Dev.
 Beach_ID (Intercept) 0.004438 0.06662
 Year     (Intercept) 0.001640 0.04050
Number of obs: 620, groups:  Beach_ID, 8; Year, 5

Fixed effects:
                  Estimate Std. Error z value Pr(>|z|)
(Intercept)       -0.29915    0.04055  -7.377 1.62e-13 ***
Relocation..Y.N.Y -0.16402    0.05052  -3.247  0.00117 **
SPL               -0.08311    0.04365  -1.904  0.05689 .
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Correlation of Fixed Effects:
            (Intr) R..Y.N
Rlct..Y.N.Y -0.114
SPL         -0.497 -0.054


Thanks for your help,

Alessandra

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Ben Bolker

Mon, Feb 17, 2020 12:15 PM #

That's correct.  There are some delicate issues about prediction:

* do you want to use the original (potentially unbalanced) data for
prediction? (That's what you're doing here).
* or, do you want to make predictions for a "typical" nest and week
combination, in which case you would use


  pframe <- with(your_data,
        expand.grid(Relocation..Y.N.=unique(Relocation..Y.N.),
            SP=unique(SP))
  predict(m.unhatched,type="response",re.form=NA,newdata=pframe))

  You could also use expand.grid() to generate a balanced design (i.e.
all combinations of weeks and nests), which would give yet another answer.

  There are a lot of packages designed for doing these kinds of
post-fitting manipulations (e.g. 'margins', 'emmeans'), you might find
them useful ...

On 2020-02-17 1:48 p.m., Alessandra Bielli wrote:

Dear Thierry

Thanks for your reply.

I read a bit about the prediction for a binomial model with
success/failures and I have a couple of questions.?

If I use the predict function with the model you recommended, I obtain
log.odds or probabilities if I use "type=response":

tapply(predict(m.unhatched,type="response"),list(main$SP,main$Relocation..Y.N.),mean)
? ? ? ? ? N ? ? ? ? Y
G 0.7314196 0.6414554
L 0.6983576 0.6003087

Are these probabilities of success (i.e. hatched) in one nest??

Thanks,

Alessandra

On Mon, Feb 17, 2020 at 7:18 AM Thierry Onkelinx
<thierry.onkelinx at inbo.be <mailto:thierry.onkelinx at inbo.be>> wrote:

    Dear Alessandra,

    Since you have both the number hatched and the total clutch size you
    can calculate the number of successes and failures. That is
    sufficient for a binomial distribution.

    glmer(cbind(Hatched, Unhatched) ~ Relocation..Y.N. + SP + (1 |
    Beach_ID) + (1 | Week), family = binmial)

    A negative binomial or Poisson allow predictions larger than the
    offset. Which is nonsense given that the number hatched cannot
    surpass the total clutch size.

    Best regards,

    ir. Thierry Onkelinx
    Statisticus / Statistician

    Vlaamse Overheid / Government of Flanders
    INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR
    NATURE AND FOREST
    Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
    thierry.onkelinx at inbo.be <mailto:thierry.onkelinx at inbo.be>
    Havenlaan 88 bus 73, 1000 Brussel
    www.inbo.be <http://www.inbo.be>

    ///////////////////////////////////////////////////////////////////////////////////////////
    To call in the statistician after the experiment is done may be no
    more than asking him to perform a post-mortem examination: he may be
    able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
    The plural of anecdote is not data. ~ Roger Brinner
    The combination of some data and an aching desire for an answer does
    not ensure that a reasonable answer can be extracted from a given
    body of data. ~ John Tukey
    ///////////////////////////////////////////////////////////////////////////////////////////

    <https://www.inbo.be>


    Op wo 12 feb. 2020 om 18:42 schreef Alessandra Bielli
    <bielli.alessandra at gmail.com <mailto:bielli.alessandra at gmail.com>>:

        Dear Ben

        Thanks for your quick response.

        Yes, emergence success is usually between 60 and 80% or higher.
        I am not sure how to use a binomial, if my data are counts?

        Can you explain why the approximation doesn't work well if
        success gets
        much above 50%? Does it make sense, then, to have "unhatched" as
        dependent
        variable, so that I predict mortality (usually below 50%) using
        a nb with
        offset(log(total clutch)) ?

        > summary(m.emerged)

        Generalized linear mixed model fit by maximum likelihood (Laplace
        Approximation) ['glmerMod']
        ?Family: Negative Binomial(2.2104)? ( log )
        Formula: Unhatched ~ Relocation..Y.N. + SP +
        offset(log(Total_Clutch)) +
        ? ?(1 | Beach_ID) + (1 | Week)
        ? ?Data: main

        ? ? ?AIC? ? ? BIC? ?logLik deviance df.resid
        ? 5439.4? ?5466.0? -2713.7? ?5427.4? ? ? 614

        Scaled residuals:
        ? ? Min? ? ? 1Q? Median? ? ? 3Q? ? ?Max
        -1.4383 -0.7242 -0.2287? 0.4866? 4.0531

        Random effects:
        ?Groups? ?Name? ? ? ? Variance Std.Dev.
        ?Week? ? ?(Intercept) 0.003092 0.0556
        ?Beach_ID (Intercept) 0.025894 0.1609
        Number of obs: 620, groups:? Week, 31; Beach_ID, 8

        Fixed effects:
        ? ? ? ? ? ? ? ? ? Estimate Std. Error z value Pr(>|z|)
        (Intercept)? ? ? ?-1.38864? ? 0.08227 -16.879? < 2e-16 ***
        Relocation..Y.N.Y? 0.32105? ? 0.09152? ?3.508 0.000452 ***
        SPL? ? ? ? ? ? ? ? 0.22218? ? 0.08793? ?2.527 0.011508 *
        ---
        Signif. codes:? 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

        Correlation of Fixed Effects:
        ? ? ? ? ? ? (Intr) R..Y.N
        Rlct..Y.N.Y -0.143
        SPL? ? ? ? ?-0.540 -0.038

        Thanks,

        Alessandra

        On Tue, Feb 11, 2020 at 7:29 PM Ben Bolker <bbolker at gmail.com
        <mailto:bbolker at gmail.com>> wrote:

        >
        >? ?Short answer: if emergence success gets much above 50%, then the
        > approximation you're making (Poisson + offset for binomial, or

        NB +

        > offset for negative binomial) doesn't work well.? You might try a
        > beta-binomial (with glmmTMB) or a binomial + an

        observation-level random

        > effect.
        >
        >? ?(On the other hand, your intercept is -0.3, which

        corresponds to a

        > baseline emergence of 0.42 - not *very* high (but some beaches

        and years

        > will be well above that ...)
        >
        >? ?Beyond that, are there any obvious patterns of mis-fit in the
        > predicted values ... ?
        >
        > On 2020-02-11 8:09 p.m., Alessandra Bielli wrote:

        > > Dear list
        > >
        > > I am fitting a poisson model to estimate the effect of a

        treatment on

        > > emergence success of hatchlings. To estimate emergence

        success, I use

        > > number of emerged and an offset(log(total clutch).
        > >
        > > However, overdispersion was detected:
        > >

        > >> overdisp_fun(m.emerged) #overdispersion detected

        > >
        > >? ? ? ?chisq? ? ? ?ratio? ? ? ? ?rdf? ? ? ? ? ?p
        > > 3490.300836? ? 5.684529? 614.000000? ? 0.000000
        > >
        > > Therefore, I switched to a negative binomial. I know

        overdispersion is

        > not

        > > relevant for nb models, but the model plots don't look too

        good. I also

        > > tried to fit a poisson model with OLRE, but still the? plots

        don't look

        > > good.
        > > How do I know if my model is good enough, and what can I do

        to improve

        > it?

> >

        > >> summary(m.emerged)

        > > Generalized linear mixed model fit by maximum likelihood

        (Laplace

        > > Approximation) ['glmerMod']
        > >? Family: Negative Binomial(7.604)? ( log )
        > > Formula: Hatched ~ Relocation..Y.N. + SP +

        offset(log(Total_Clutch)) + (1

        > > |Beach_ID) + (1 | Year)
        > >? ? Data: main
        > >
        > >? ? ? AIC? ? ? BIC? ?logLik deviance df.resid
        > >? ?6015.6? ?6042.2? -3001.8? ?6003.6? ? ? 614
        > >
        > > Scaled residuals:
        > >? ? ?Min? ? ? 1Q? Median? ? ? 3Q? ? ?Max
        > > -2.6427 -0.3790? 0.1790? 0.5242? 1.6583
        > >
        > > Random effects:
        > >? Groups? ?Name? ? ? ? Variance Std.Dev.
        > >? Beach_ID (Intercept) 0.004438 0.06662
        > >? Year? ? ?(Intercept) 0.001640 0.04050
        > > Number of obs: 620, groups:? Beach_ID, 8; Year, 5
        > >
        > > Fixed effects:
        > >? ? ? ? ? ? ? ? ? ?Estimate Std. Error z value Pr(>|z|)
        > > (Intercept)? ? ? ?-0.29915? ? 0.04055? -7.377 1.62e-13 ***
        > > Relocation..Y.N.Y -0.16402? ? 0.05052? -3.247? 0.00117 **
        > > SPL? ? ? ? ? ? ? ?-0.08311? ? 0.04365? -1.904? 0.05689 .
        > > ---
        > > Signif. codes:? 0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1
        > >
        > > Correlation of Fixed Effects:
        > >? ? ? ? ? ? ?(Intr) R..Y.N
        > > Rlct..Y.N.Y -0.114
        > > SPL? ? ? ? ?-0.497 -0.054
        > >
        > >
        > > Thanks for your help,
        > >
        > > Alessandra
        > >
        > >
        > > _______________________________________________
        > > R-sig-mixed-models at r-project.org

        <mailto:R-sig-mixed-models at r-project.org> mailing list

        > > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
        > >

        >
        > _______________________________________________
        > R-sig-mixed-models at r-project.org

        <mailto:R-sig-mixed-models at r-project.org> mailing list

        > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
        >

        _______________________________________________
        R-sig-mixed-models at r-project.org
        <mailto:R-sig-mixed-models at r-project.org> mailing list
        https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

Alessandra Bielli

Mon, Feb 17, 2020 4:01 PM #

Dear Ben

I am trying to make a prediction for the combination of species (L or G)
and treatment (control/experiment).

I am still confused about the prediction values. I would like to present
results as a success rate for a nest, to say that treatment
increases/decreases success by ...%. But the value I have is the
probability that 1 egg in the nest succeeds, correct? I am not sure how to
use these predictions.

Thanks for your help!

Alessandra

On Mon, Feb 17, 2020 at 2:15 PM Ben Bolker <bbolker at gmail.com> wrote:

    That's correct.  There are some delicate issues about prediction:

* do you want to use the original (potentially unbalanced) data for
prediction? (That's what you're doing here).
* or, do you want to make predictions for a "typical" nest and week
combination, in which case you would use


  pframe <- with(your_data,
        expand.grid(Relocation..Y.N.=unique(Relocation..Y.N.),
            SP=unique(SP))
  predict(m.unhatched,type="response",re.form=NA,newdata=pframe))

  You could also use expand.grid() to generate a balanced design (i.e.
all combinations of weeks and nests), which would give yet another answer.

  There are a lot of packages designed for doing these kinds of
post-fitting manipulations (e.g. 'margins', 'emmeans'), you might find
them useful ...



On 2020-02-17 1:48 p.m., Alessandra Bielli wrote:

Dear Thierry

Thanks for your reply.

I read a bit about the prediction for a binomial model with
success/failures and I have a couple of questions.

If I use the predict function with the model you recommended, I obtain
log.odds or probabilities if I use "type=response":

tapply(predict(m.unhatched,type="response"),list(main$SP,main$Relocation..Y.N.),mean)

          N         Y
G 0.7314196 0.6414554
L 0.6983576 0.6003087

Are these probabilities of success (i.e. hatched) in one nest?

Thanks,

Alessandra

On Mon, Feb 17, 2020 at 7:18 AM Thierry Onkelinx
<thierry.onkelinx at inbo.be <mailto:thierry.onkelinx at inbo.be>> wrote:

    Dear Alessandra,

    Since you have both the number hatched and the total clutch size you
    can calculate the number of successes and failures. That is
    sufficient for a binomial distribution.

    glmer(cbind(Hatched, Unhatched) ~ Relocation..Y.N. + SP + (1 |
    Beach_ID) + (1 | Week), family = binmial)

    A negative binomial or Poisson allow predictions larger than the
    offset. Which is nonsense given that the number hatched cannot
    surpass the total clutch size.

    Best regards,

    ir. Thierry Onkelinx
    Statisticus / Statistician

    Vlaamse Overheid / Government of Flanders
    INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR
    NATURE AND FOREST
    Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
    thierry.onkelinx at inbo.be <mailto:thierry.onkelinx at inbo.be>
    Havenlaan 88 bus 73, 1000 Brussel
    www.inbo.be <http://www.inbo.be>

 ///////////////////////////////////////////////////////////////////////////////////////////

    To call in the statistician after the experiment is done may be no
    more than asking him to perform a post-mortem examination: he may be
    able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
    The plural of anecdote is not data. ~ Roger Brinner
    The combination of some data and an aching desire for an answer does
    not ensure that a reasonable answer can be extracted from a given
    body of data. ~ John Tukey

 ///////////////////////////////////////////////////////////////////////////////////////////

    <https://www.inbo.be>


    Op wo 12 feb. 2020 om 18:42 schreef Alessandra Bielli
    <bielli.alessandra at gmail.com <mailto:bielli.alessandra at gmail.com>>:

        Dear Ben

        Thanks for your quick response.

        Yes, emergence success is usually between 60 and 80% or higher.
        I am not sure how to use a binomial, if my data are counts?

        Can you explain why the approximation doesn't work well if
        success gets
        much above 50%? Does it make sense, then, to have "unhatched" as
        dependent
        variable, so that I predict mortality (usually below 50%) using
        a nb with
        offset(log(total clutch)) ?

        > summary(m.emerged)

        Generalized linear mixed model fit by maximum likelihood (Laplace
        Approximation) ['glmerMod']
         Family: Negative Binomial(2.2104)  ( log )
        Formula: Unhatched ~ Relocation..Y.N. + SP +
        offset(log(Total_Clutch)) +
           (1 | Beach_ID) + (1 | Week)
           Data: main

             AIC      BIC   logLik deviance df.resid
          5439.4   5466.0  -2713.7   5427.4      614

        Scaled residuals:
            Min      1Q  Median      3Q     Max
        -1.4383 -0.7242 -0.2287  0.4866  4.0531

        Random effects:
         Groups   Name        Variance Std.Dev.
         Week     (Intercept) 0.003092 0.0556
         Beach_ID (Intercept) 0.025894 0.1609
        Number of obs: 620, groups:  Week, 31; Beach_ID, 8

        Fixed effects:
                          Estimate Std. Error z value Pr(>|z|)
        (Intercept)       -1.38864    0.08227 -16.879  < 2e-16 ***
        Relocation..Y.N.Y  0.32105    0.09152   3.508 0.000452 ***
        SPL                0.22218    0.08793   2.527 0.011508 *
        ---
        Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

        Correlation of Fixed Effects:
                    (Intr) R..Y.N
        Rlct..Y.N.Y -0.143
        SPL         -0.540 -0.038

        Thanks,

        Alessandra

        On Tue, Feb 11, 2020 at 7:29 PM Ben Bolker <bbolker at gmail.com
        <mailto:bbolker at gmail.com>> wrote:

        >
        >   Short answer: if emergence success gets much above 50%, then

the

        > approximation you're making (Poisson + offset for binomial, or

        NB +

        > offset for negative binomial) doesn't work well.  You might

try a

        > beta-binomial (with glmmTMB) or a binomial + an

        observation-level random

        > effect.
        >
        >   (On the other hand, your intercept is -0.3, which

        corresponds to a

        > baseline emergence of 0.42 - not *very* high (but some beaches

        and years

        > will be well above that ...)
        >
        >   Beyond that, are there any obvious patterns of mis-fit in the
        > predicted values ... ?
        >
        > On 2020-02-11 8:09 p.m., Alessandra Bielli wrote:

        > > Dear list
        > >
        > > I am fitting a poisson model to estimate the effect of a

        treatment on

        > > emergence success of hatchlings. To estimate emergence

        success, I use

        > > number of emerged and an offset(log(total clutch).
        > >
        > > However, overdispersion was detected:
        > >

        > >> overdisp_fun(m.emerged) #overdispersion detected

        > >
        > >       chisq       ratio         rdf           p
        > > 3490.300836    5.684529  614.000000    0.000000
        > >
        > > Therefore, I switched to a negative binomial. I know

        overdispersion is

        > not

        > > relevant for nb models, but the model plots don't look too

        good. I also

        > > tried to fit a poisson model with OLRE, but still the  plots

        don't look

        > > good.
        > > How do I know if my model is good enough, and what can I do

        to improve

        > it?

> >

        > >> summary(m.emerged)

        > > Generalized linear mixed model fit by maximum likelihood

        (Laplace

        > > Approximation) ['glmerMod']
        > >  Family: Negative Binomial(7.604)  ( log )
        > > Formula: Hatched ~ Relocation..Y.N. + SP +

        offset(log(Total_Clutch)) + (1

        > > |Beach_ID) + (1 | Year)
        > >    Data: main
        > >
        > >      AIC      BIC   logLik deviance df.resid
        > >   6015.6   6042.2  -3001.8   6003.6      614
        > >
        > > Scaled residuals:
        > >     Min      1Q  Median      3Q     Max
        > > -2.6427 -0.3790  0.1790  0.5242  1.6583
        > >
        > > Random effects:
        > >  Groups   Name        Variance Std.Dev.
        > >  Beach_ID (Intercept) 0.004438 0.06662
        > >  Year     (Intercept) 0.001640 0.04050
        > > Number of obs: 620, groups:  Beach_ID, 8; Year, 5
        > >
        > > Fixed effects:
        > >                   Estimate Std. Error z value Pr(>|z|)
        > > (Intercept)       -0.29915    0.04055  -7.377 1.62e-13 ***
        > > Relocation..Y.N.Y -0.16402    0.05052  -3.247  0.00117 **
        > > SPL               -0.08311    0.04365  -1.904  0.05689 .
        > > ---
        > > Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ?

        > >
        > > Correlation of Fixed Effects:
        > >             (Intr) R..Y.N
        > > Rlct..Y.N.Y -0.114
        > > SPL         -0.497 -0.054
        > >
        > >
        > > Thanks for your help,
        > >
        > > Alessandra
        > >
        > >
        > > _______________________________________________
        > > R-sig-mixed-models at r-project.org

        <mailto:R-sig-mixed-models at r-project.org> mailing list

        > > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
        > >

        >
        > _______________________________________________
        > R-sig-mixed-models at r-project.org

        <mailto:R-sig-mixed-models at r-project.org> mailing list

        > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
        >

        _______________________________________________
        R-sig-mixed-models at r-project.org
        <mailto:R-sig-mixed-models at r-project.org> mailing list
        https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models