Dealing with Overdispersion in Count Data with Mixed Modeling

Thu, Nov 11, 2010 11:43 AM

On 11/11/2010 02:11 PM, David Stainbrook wrote:

[correction: it doesn't implement negative binomial models at all,
although I believe that *in* principle this could be added by using an
additional 'k' parameter that controlled the mean-variance relationship,
in a way analogous to glm.nb() in the MASS package].

and quasi-poisson has issues with the

Yes.

yes, exactly.

That is what I would expect.

  Ben Bolker

Thanks for the help,

..................................................................

David Stainbrook
M.S. Graduate Research Assistant
Pennsylvania Cooperative Fish & Wildlife Research Unit
The Pennsylvania State University

..................................................................


On Mon, Nov 1, 2010 09:53 PM, *Ben Bolker <bbolker at gmail.com>* wrote:

      [cc'ing r-sig-mixed-models: it's best to keep sending replies back to
    the list so they can be archived and others can read them, or offer input]

    On 10-11-01 01:28 PM, David Stainbrook wrote:

    > Ben,
    > 
    > Thanks for your input. I read that article that you suggested and it
    > appears they used SAS and Genstat to do their analysis.

      Yes (although as I said at the time, I wouldn't actually trust the
    methods that they used in Genstat for this problem.  I just think their
    description of the problem is clear).

    > Is it possible
    > to use the Poisson-lognormal model in R or translate the model to this
    > using R and lmer? Another professor mentioned that I may be able to get
    > it to work using a negative binomial model in SAS or ADModel Builder.
    > What do you suggest?

      Yes, you can use the Poisson-lognormal in recent versions of lme4,
    simply by including an individual-level random variable.  You may get
    warnings.
      You could indeed use a negative binomial model in SAS or AD Model
    Builder (in ADMB you could also use the lognormal-Poisson model).
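
For readers of the archive, here is a minimal sketch of the individual-level random effect approach described above. It assumes a current lme4, in which GLMMs are fitted with glmer() (at the time of this thread, lmer() with a family argument was used). The data are simulated, and every name and value (d, group, obs, n_grp, the effect sizes) is hypothetical rather than taken from the original poster's analysis. The fit is guarded so the script still runs where lme4 is not installed.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

n_grp <- 15  # hypothetical number of groups (e.g. individuals)
n_per <- 20  # hypothetical number of counts per group

d <- data.frame(
  group = factor(rep(seq_len(n_grp), each = n_per)),
  x     = rnorm(n_grp * n_per)
)

# One factor level per row: this is the individual-level random variable
d$obs <- factor(seq_len(nrow(d)))

# Log-scale linear predictor: fixed effects + group effect + per-row normal noise
grp_eff <- rnorm(n_grp, sd = 0.5)
eta <- 0.5 + 0.3 * d$x + grp_eff[as.integer(d$group)] + rnorm(nrow(d), sd = 0.7)
d$y <- rpois(nrow(d), lambda = exp(eta))

if (requireNamespace("lme4", quietly = TRUE)) {
  # Poisson GLMM with an observation-level random intercept
  # (lognormal-Poisson); convergence warnings are possible
  fit <- lme4::glmer(y ~ x + (1 | group) + (1 | obs),
                     family = poisson, data = d)
  print(lme4::VarCorr(fit))
}
```

The variance reported for obs plays the role of the overdispersion parameter: a value near zero would suggest the plain Poisson model is adequate.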

    > Do you have any idea why Doug allowed the lmer function to fit
    > quasipoisson if he doesn't feel that the results will be reliable? I
    > would have trusted my results and wouldn't have had any idea that they
    > might have been unreliable if he had not said that.

      I believe he implemented it a while ago and his opinions have now
    changed. (I agree that it might be a good idea to disable this
    functionality.)

    > Also, do you have any idea how to increase both the default number of
    > function evaluations and iterations with the control statement within
    > the lmer model statement?

      ..., control = list(maxIter = 2000, maxFN = 3000), ... should work.

* you seem to be tackling a difficult problem. I appreciate that
  you're offering full details on your problem (full scripts and data),
  but it's going to take someone else at least half an hour (and probably
  quite a bit more) to get up to speed on what you're doing and what's not
  working; unfortunately, that's more than most anyone has time for,
  unless the problem happens to be something very close to their
  interests. Unfortunately, you may well need to find local help for this
  (your advisor? a friendly stats professor or graduate student?)
  <I already exhausted those options, hence contacting you and Doug>
* it's possible, depending on the complexity of your model, that
  you're simply trying to fit too complicated a model. You do have a lot
  of data points, but some of your covariates may be strongly correlated.
  Have you tried:
  - seeing if you can successfully fit a subset of the data points
    (this could be faster, allowing you to debug quicker)?
  - seeing if you can successfully fit a subset of the covariates, or
    which covariates or combinations of covariates are problematic?
  - seeing if you can successfully fit a non-mixed (GLM) model,
    treating 'individual' as a fixed effect?
  - simulating data, possibly in a simplified form, to see if you can
    get the right answer when you know what it is?
* lme4 is quite finicky about convergence, on the philosophy that it's
  better not to give an answer than to give a wrong one.
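
A minimal sketch of two of the checks listed above: simulating data with known parameters, then fitting a non-mixed Poisson GLM with 'individual' as a fixed effect. It uses base R only. All names and values (n_ind, n_rep, sd_obs, beta) are hypothetical and chosen purely for illustration, not taken from the original poster's data.

```r
set.seed(101)  # arbitrary seed, for reproducibility only

n_ind  <- 20           # hypothetical number of individuals
n_rep  <- 30           # hypothetical observations per individual
sd_obs <- 0.8          # true SD of the per-observation (lognormal) noise
beta   <- c(1.0, 0.5)  # true intercept and slope on the log scale

d <- data.frame(
  individual = factor(rep(seq_len(n_ind), each = n_rep)),
  x          = rnorm(n_ind * n_rep)
)

# Lognormal-Poisson counts: Poisson draws whose log-mean carries normal noise
ind_eff <- rnorm(n_ind, sd = 0.5)
eta <- beta[1] + beta[2] * d$x +
  ind_eff[as.integer(d$individual)] +
  rnorm(nrow(d), sd = sd_obs)
d$y <- rpois(nrow(d), lambda = exp(eta))

# Non-mixed check: Poisson GLM with 'individual' as a fixed effect
fit_glm <- glm(y ~ x + individual, family = poisson, data = d)

# Pearson dispersion statistic; values well above 1 indicate overdispersion
disp <- sum(residuals(fit_glm, type = "pearson")^2) / df.residual(fit_glm)
print(disp)

# Compare the estimated slope with the known true value beta[2]
slope_hat <- unname(coef(fit_glm)["x"])
print(c(true = beta[2], estimated = slope_hat))
```

Because the true values are known, the same simulated data can then be passed to the mixed model to see whether it recovers them before moving on to the real data.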

R does have its advantages, but if you're up to working with SAS or AD
Model Builder I would recommend you also try those approaches -- see if
you run into the same problems. But I would definitely try some of the
trouble-shooting strategies above, first.

good luck,
Ben Bolker

    > Thanks again,
    > 
    > David

    > 
    > On Thu, Oct 28, 2010 04:49 PM, *Ben Bolker <bbolker at gmail.com>* wrote:

    > 
    >        My advice would be to use an individual-level random variable
    >     (translating to a lognormal-Poisson model, which is qualitatively
    >     similar to a negative binomial) -- see e.g. Elston et al 2001 for a
    >     decent explanation, although you should not necessarily trust the
    >     numeric methods they use ...
    > 
    >      [Elston, D. A., R. Moss, T. Boulinier, C. Arrowsmith, and X. Lambin.
    >     2001. Analysis of Aggregation, a Worked Example: Numbers of Ticks on
    >     Red Grouse Chicks. Parasitology 122, no. 05: 563-569.
    >     doi:10.1017/S0031182001007740.
    >     http://journals.cambridge.org/action/displayAbstract?fromPage=online&aid=82701.]

    > 
    > 
    >       (Doug, thanks for the vote of confidence!)
    > 
    >       cheers
    >         Ben
    > 
    > 
    > 
    > 
    >
