
Repeated-measures analysis with count data following a negative-binomial distribution

3 messages · VINCENT KOPPELMANS, Thierry Onkelinx, Paul Johnson

#
Dear all,

I am looking for advice on how to run a repeated-measures analysis on count data.

My issue is as follows:

  *   We counted the number of carotid plaques from ultrasound images in a diseased population and a group of control subjects
  *   We have measured the plaques for all subjects in both the left and the right carotid artery during the same session
  *   The number of plaques is a count score ranging from 0 to 6
  *   The distributions look like this:
     *   Plaques in the left carotid artery: https://www.dropbox.com/s/t5tqh4wjrfc5eml/Left.png?dl=0
     *   Plaques in the right carotid artery: https://www.dropbox.com/s/nl5ezef145av2ae/Right.png?dl=0
     *   (Where NKI= the diseased population; RSS= the control subjects; (all)= the two groups combined. There are no numbers on the x-axis, but the 7 columns are the count scores 0-6 (left to right).)
  *   There is biological evidence that the distribution of plaques for left and right differ in the general population (i.e., our control subjects).
  *   I would like to test if the difference in distribution of plaque scores between the left and right carotid arteries is different between my two populations.
  *   Previous analyses (e.g., comparing a single side between groups) showed me that a negative binomial distribution is a better fit for my data than a Poisson distribution.

My idea is to run a repeated-measures negative binomial regression analysis where plaque score measures (left and right) would be the repeated measures. In this case I would be interested in the "body side" by group interaction.
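
A repeated-measures model of this kind needs the data in long format, with one row per subject per side. Below is a minimal base-R sketch of that layout. The column names (subject, group, side, plaques) and the four example subjects are made up for illustration; only the group labels RSS (controls) and NKI (diseased) come from the figures described above.

```r
# Hypothetical wide table: one row per subject, one column per carotid side.
# The counts are invented purely to show the reshaping step.
wide <- data.frame(
  subject = 1:4,
  group   = factor(c("RSS", "RSS", "NKI", "NKI"), levels = c("RSS", "NKI")),
  left    = c(0, 2, 1, 4),
  right   = c(1, 0, 3, 2)
)

# Stack the two sides so that each subject contributes two rows.
long <- data.frame(
  subject = rep(wide$subject, times = 2),
  group   = rep(wide$group, times = 2),
  side    = factor(rep(c("left", "right"), each = nrow(wide)),
                   levels = c("left", "right")),
  plaques = c(wide$left, wide$right)
)

# Keep each subject's rows contiguous; some GEE routines assume this.
long <- long[order(long$subject, long$side), ]
rownames(long) <- NULL
long
```

With the data in this shape, the "body side" by group interaction corresponds to a term such as side:group in the model formula.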

My questions are:

  *   Is this a good and valid approach?
  *   I am thinking about using R's gee package (https://cran.r-project.org/web/packages/gee/index.html). Would that be the right tool for this job?

Thanks!

- Vincent
#
Dear Vincent,

A mixed model with subject as random intercept is recommended. You need to
think about the counts. Are they really counts? Or are they an ordinal
factor? The negative binomial distribution is OK in case of counts.

GEE is valid if you want to estimate the marginal (population-averaged)
effects. Use lme4 or similar when you want conditional (subject-specific)
effects.
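
As a rough illustration of the two routes, the sketch below fits both models to simulated data. Everything about the data is an assumption: the variable names, the sample size, the effect sizes and the dispersion are invented, and the simulated counts are not capped at 6. The conditional model uses lme4::glmer.nb with a random intercept per subject. For the marginal model, gee::gee has no negative binomial family as far as I know, so a Poisson working variance with robust (sandwich) standard errors stands in for it here.

```r
library(lme4)  # glmer.nb(): negative binomial GLMM
library(gee)   # gee(): generalized estimating equations

set.seed(42)
n <- 200  # hypothetical number of subjects

# Two rows per subject (left and right), already sorted by subject.
sim <- data.frame(
  subject = rep(seq_len(n), each = 2),
  group   = factor(rep(sample(c("RSS", "NKI"), n, replace = TRUE), each = 2),
                   levels = c("RSS", "NKI")),
  side    = factor(rep(c("left", "right"), times = n),
                   levels = c("left", "right"))
)

# Invented effects on the log scale, plus a subject-level random intercept.
u   <- rep(rnorm(n, sd = 0.5), each = 2)
eta <- -0.5 +
  0.2 * (sim$side == "right") +
  0.4 * (sim$group == "NKI") +
  0.3 * (sim$side == "right" & sim$group == "NKI") +
  u
sim$plaques <- rnbinom(nrow(sim), mu = exp(eta), size = 2)

# Conditional (subject-specific) model: negative binomial GLMM.
fit_glmm <- glmer.nb(plaques ~ side * group + (1 | subject), data = sim)

# Marginal (population-averaged) model: GEE with exchangeable correlation.
# Poisson working variance; overdispersion is handled by the robust SEs.
fit_gee <- gee(plaques ~ side * group, id = subject, data = sim,
               family = poisson, corstr = "exchangeable")

# The side-by-group interaction is the term of interest in both fits.
glmm_int <- summary(fit_glmm)$coefficients["sideright:groupNKI", ]
gee_int  <- summary(fit_gee)$coefficients["sideright:groupNKI", ]
glmm_int
gee_int
```

For the GEE fit, the "Robust S.E." and "Robust z" columns are the ones to read, since the naive columns rely on the Poisson working variance being correct.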

Best regards,


ir. Thierry Onkelinx
Statisticus / Statistician

Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND
FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkelinx at inbo.be
Havenlaan 88 bus 73, 1000 Brussel
www.inbo.be

2018-03-20 6:58 GMT+01:00 VINCENT KOPPELMANS <vincent.koppelmans at utah.edu>:

  
  
#
Just on the GEE vs GLMM question, this is a really nice clear article explaining the differences and how to choose:

Marginal or conditional regression models for correlated non-normal data?
Muff et al. 2016 MEE.
https://doi.org/10.1111/2041-210X.12623

Remarkably it has been cited only once in 2 years (Google Scholar).

To summarise their recommendations: in most cases you'll want a conditional (GLMM) model.
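
For the log link used with count models, the relation between the two kinds of coefficients can be worked out directly. The derivation below is a standard result, not something taken from the article, and it assumes a conditional model with only a normally distributed random intercept; the symbols are introduced here for illustration.

```latex
% Conditional (subject-specific) model for subject i, measurement j:
\log \mathrm{E}[Y_{ij} \mid b_i] = x_{ij}^{\top}\beta + b_i,
\qquad b_i \sim \mathcal{N}(0, \sigma^2)

% Marginal (population-averaged) mean, averaging over b_i:
\mathrm{E}[Y_{ij}]
  = \exp\!\left(x_{ij}^{\top}\beta\right)\,\mathrm{E}\!\left[e^{b_i}\right]
  = \exp\!\left(x_{ij}^{\top}\beta + \tfrac{\sigma^2}{2}\right)
```

Under these assumptions only the intercept shifts, by \sigma^2/2, so the marginal and conditional slopes (including an interaction term) coincide. This does not carry over to other links such as the logit, or to models with random slopes, where the marginal coefficients generally differ from the conditional ones.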