Question about proportion data in binomial glmm

5 messages · Thierry Onkelinx, Mollie Brooks, Ben Bolker +1 more

#
I have a question about how glmmTMB handles proportion data for the
purposes of a binomial GLMM.

I combined my success and failure count data into a matrix using cbind(),
and used that as the response in my binomial GLMM fitted with glmmTMB.

However, despite there being a few instances of zero counts in both columns
and therefore an undefined proportion, the model doesn't seem to drop these
rows from my data set.

I don't get any errors or warnings when running the model, but I worry my
results might be biased because of this.

My question is: Is glmmTMB doing something like adding a tiny amount to
each value of my response in order to avoid dealing with undefined
proportion data?

Thank you for your help,

Robert
#
Dear Robert,

IMHO you should remove the cbind(0, 0) rows before fitting the model. There
is no reason to keep them in the dataset.

Best regards,

ir. Thierry Onkelinx


On Fri 24 Mar 2023 at 02:39, rtfiner <rtfiner at gmail.com> wrote:
#
They have zero contribution to the log-likelihood, so they shouldn't affect the model.
[1] 0

I can't say if they would affect any model evaluation functionality, but they shouldn't.
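
To illustrate the zero-contribution point, here is a minimal sketch in plain Python. It is not glmmTMB or TMB code; the helper binom_logpmf, the counts, and the probability 0.4 are all hypothetical. It evaluates the binomial log-density log[C(n,k) p^k (1-p)^(n-k)] directly and shows that a row with 0 successes out of 0 trials adds nothing to the summed log-likelihood, whatever p is (for 0 < p < 1).

```python
import math

def binom_logpmf(k, n, p):
    """Binomial log-density log[C(n,k) p^k (1-p)^(n-k)], valid for 0 < p < 1.

    Hypothetical helper for illustration; not the TMB/glmmTMB implementation.
    """
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + k * math.log(p) + (n - k) * math.log(1 - p)

# Made-up (successes, failures) rows; the middle row is the cbind(0, 0) case.
rows = [(3, 7), (0, 0), (5, 5)]
p = 0.4  # assumed success probability, the same for every row

# Per-row contributions to the log-likelihood.
contrib = [binom_logpmf(s, s + f, p) for s, f in rows]

# Totals with and without the empty row.
total_all = sum(contrib)
total_nonempty = sum(binom_logpmf(s, s + f, p) for s, f in rows if s + f > 0)

print(contrib[1])                              # contribution of the (0, 0) row
print(math.isclose(total_all, total_nonempty)) # do the two totals agree?
```

With n = 0 and k = 0 every term in the log-density is multiplied by zero or is log(1), so the row cannot pull the estimates in either direction.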

Best,
Mollie
3 days later
#
The only further issue here is that the number of observations for 
the model will still be computed as including these null values. This 
should only matter if you're doing something like computing 
finite-size-corrected AICs (and to paraphrase _Numerical Recipes_, if 
this level of difference matters to you then you're on shaky ground 
anyway ...)
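
A rough sketch of why the observation count matters for a finite-size-corrected AIC, using the standard formula AICc = AIC + 2k(k+1)/(n-k-1). All numbers below are hypothetical (log-likelihood, parameter count, and row counts), and the function aicc is an illustrative helper, not something taken from glmmTMB. Because the empty rows leave the log-likelihood unchanged, only n differs between the two calls.

```python
def aicc(loglik, k, n):
    """Small-sample corrected AIC: AIC + 2k(k+1)/(n-k-1), with AIC = -2*loglik + 2k."""
    aic = -2.0 * loglik + 2.0 * k
    return aic + 2.0 * k * (k + 1) / (n - k - 1)

loglik = -120.0   # hypothetical; identical with or without the (0, 0) rows
k = 5             # hypothetical number of estimated parameters
n_all = 60        # rows counted including three hypothetical (0, 0) rows
n_nonempty = 57   # rows counted after dropping them

# Only the correction term changes, because only n changes.
diff = aicc(loglik, k, n_nonempty) - aicc(loglik, k, n_all)
print(diff)
```

The gap shrinks as n grows relative to k, which is why it is only likely to matter for small data sets.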

The source code for the dbinom implementation in TMB:

https://kaskr.github.io/adcomp/distributions__R_8hpp_source.html

illustrates that values with N = 0, k = 0 will have no effect on the 
log-likelihood (while TMB mirrors R's behaviour most of the time, it's 
not 100% safe to assume that edge cases will work exactly the same in R 
and TMB)
On 2023-03-24 6:36 a.m., Mollie Brooks wrote:
2 days later
#
Thank you all for your input. I tried fitting the model with the zero-count
rows (the NaN proportions) removed, and the output and evaluation were very
similar, so perhaps I am okay?

-Robert
On Mon, Mar 27, 2023 at 9:18 AM Ben Bolker <bbolker at gmail.com> wrote: