Poisson mixed models: Non-integer response variable in lmer?

Tue, Mar 15, 2011 6:36 AM

[cc'ing back to r-sig-mixed]

On 03/15/2011 01:37 AM, Daniel Barton wrote:

I get the general point, although I guess in this case you would want
to divide the original data by 10 (mean = 3/10 = var = 30/100) ?

This just seemed against all of my training (even though as I noted

I don't think there's a reference: welcome to the cutting edge ... I
agree (and was almost going to mention) that under other circumstances
(quasi-likelihood estimation), we do almost the equivalent of this
scaling in order to remove overdispersion.  I don't think this will
necessarily work right (I haven't thought it all the way through) with
sampling periods of different lengths/sizes, though.

   Ben Bolker

Best,
Dan Barton

On Mon, Mar 14, 2011 at 6:39 PM, Ben Bolker <bbolker at gmail.com> wrote:

    On 11-03-14 07:14 PM, Daniel Barton wrote:

    > Hello,
    >      Thanks to everyone who contributes to this list!  I often find
    > random questions I have answered in the archives of this list.
    >
    > My specific question of the moment, a simplified example of what I'm
    > doing that I hope illustrates my question...
    >
    >      If we have a poisson-distributed response variable in a mixed
    > model such as called by:
    >
    > lmer(amrotot ~ year + (year|route), family=poisson(link=log))
    >
    >     where amrotot is an integer count, year is, well, the year (as a
    > linear predictor, not a factor) and route is a sampling unit.  If
    > 'exposure' varies by route, we can define another model with an
    > offset such as:
    >
    > lmer(amrotot ~ year + (year|route), offset=effort, family=poisson(link=log))
    >
    >      this all seems, generally good and fine.  A colleague asked me
    > why not use (amrotot/effort) as the response variable, but this of
    > course results in a non-integer response variable.  Yet it turns out,
    > lmer (or glm, for that matter) will indeed estimate a model using the
    > non-integer response variable (amrotot/effort) but gives warnings.  I
    > understand that poisson regression assumes a poisson-distributed
    > integer response variable, but I was curious about *why* lmer would
    > provide results for non-integer response variables such as
    > (amrotot/effort) and if these results are valid or somehow comparable
    > to results where amrotot is the response and effort is an offset,
    > with special reference to the confidence intervals of the random
    > effects.  Using non-integer response variables in poisson regression
    > looks and seems wrong to me, but IANA statistician and maybe lmer is
    > doing something I don't quite get to make this work.

     It won't work: there's a reason that Poisson models are restricted
    to count data.  In particular, in Poisson models the assumption is that
    the (expected) variance is equal to the (expected) mean for any data
    point: if you rescale the data points, then the variance-to-mean
    relationship will change with the units used, something you probably
    don't want.
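
The variance-to-mean argument above can be written out in a couple of lines. This is only a sketch: the symbols Y (the raw count), \mu (its expected value), and c (a positive rescaling constant, such as effort or a change of time units) are notation introduced here, not taken from the thread.

```latex
\begin{align*}
Y \sim \mathrm{Poisson}(\mu)
  &\;\Longrightarrow\; \mathrm{E}[Y] = \mathrm{Var}[Y] = \mu \\
\mathrm{E}\!\left[\frac{Y}{c}\right] = \frac{\mu}{c},
  &\qquad \mathrm{Var}\!\left[\frac{Y}{c}\right] = \frac{\mu}{c^{2}} \\
\frac{\mathrm{Var}[Y/c]}{\mathrm{E}[Y/c]} &= \frac{1}{c}
\end{align*}
```

So the rescaled response keeps the Poisson property "variance equals mean" only when c = 1; for any other c the ratio depends on the units chosen.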

     e.g. if the sampling period is 1 hour and you have 1 count in the
    sampling period, the mean and variance will both be 1 (unitless); if you
    divide the counts by 60 to get counts per minute, then the mean will be
    scaled to 1/60 counts/minute but the variance will be scaled to
    1/3600 (counts/minute)^2 ....
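
A quick numerical check of the example above, as a minimal sketch in base R. Everything in it is illustrative rather than taken from the thread's data: the simulated values, the seed, the variable names, and the use of plain `glm` (no random effects) instead of `lmer`. It also assumes `effort` is recorded on its raw scale, in which case the offset for a log link would normally be `log(effort)`; the call quoted earlier passes `effort` directly.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Part 1: counts per hour with mean 1, then rescaled to counts per minute
counts_hr  <- rpois(1e5, lambda = 1)
counts_min <- counts_hr / 60
ratio_hr   <- var(counts_hr)  / mean(counts_hr)   # close to 1
ratio_min  <- var(counts_min) / mean(counts_min)  # close to 1/60

# Part 2: offset model vs. rate-as-response model on simulated data
n      <- 2000
effort <- runif(n, min = 1, max = 10)        # hypothetical exposure
year   <- rep(0:9, length.out = n)           # hypothetical linear predictor
count  <- rpois(n, lambda = effort * exp(0.5 + 0.1 * year))
dat    <- data.frame(count, effort, year)

# Integer response, exposure handled through a log-scale offset
fit_offset <- glm(count ~ year + offset(log(effort)),
                  family = poisson(link = "log"), data = dat)

# Non-integer response: glm still fits, but warns about non-integer values
fit_rate <- suppressWarnings(
  glm(I(count / effort) ~ year,
      family = poisson(link = "log"), data = dat)
)

# Compare estimates and standard errors from the two fits
summary(fit_offset)$coefficients
summary(fit_rate)$coefficients
```

The offset fit keeps the response as an integer count, so its likelihood is a proper Poisson likelihood; the second fit treats a rate as if it were a count, so its reported standard errors rest on a variance assumption that changes with the units of `effort`.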

    You ask why glmer (and glm) lets you do this. It's generally difficult
    to decide if one should prohibit, or just warn about, a practice that
    seems odd. Sometimes there are indeed plausible scenarios (although to
    be honest I can't think of one in this case ...) where someone wants to
    use the software in a way not intended by the designers.  I won't say
    that R is completely consistent in this regard, but overall the
    philosophy of "you are assumed to know what you are doing, we will warn
    you but not stop you if you seem to be doing something silly" is
    reasonable.

     Ben Bolker