Skip to content

Mixed model and negative binomial distribution

3 messages · Seth Bigelow, David Atkins

#
Dear list: this is a continuation of a previous question. To recap,after
input from several listserv members, I have been using glmmADMB to model
snag (dead tree) densities before, after, and 5 years after forest (factor
variable 'time'). (I apologize for disregarding the advice to use 'BUGS'
approaches -- I didn't have $ for new books or time for assimilation of new
paradigm). The data sampling was stratified by 'Ecozone' ('East' densities
are much lower than 'West'). Because there are many plots with no snags, I
evaluated use of Poisson vs. Negative binomial distribution by finding mean
& variance for each combination of time & Ecozone. AIC for the NB
distribution was vastly smaller (more favorable) than for the Poisson.
Variance tended to be much higher prior to treatment (because a few plots
have large numbers of snags and loggers are instructed to knock most of
these down and simply retain a few).

My statement is glmmadmb(snag~time+Ecozone + (1|SettingID), data=.,
family="nbinom1"). ('SettingID' is a plot identifier).

So, I have these observations/questions. First, when I experiment with
'family="poisson", AIC for the glmm is only 2 AIC units greater than for the
negative binomial model, yet with AIC estimated for individual cells, AIC
Poisson >> AIC NegBinom. Any idea why this might be? (I've tried
zero-inflation for poisson & NB, doesn't seem to make much difference)

It's tricky to interpret the log-transformed coefficients and associated
confidence intervals, but they don't seem to reflect the decrease in snags
after treatment that I see when I examine the cell means (e.g., mean East
density at time1 (pre-treatment) is 1.7, dropping to 0.7 at time2 (after
treatment).) Here's part of the summary statement:

Coefficients:
            Estimate Std. Error z value Pr(>|z|)   
(Intercept)     -0.764      0.359   -2.13   0.0334 * 
time2             -0.588    0.186   -3.16   0.0016 **
time3            -0.604     0.189   -3.20   0.0014 **
EcozoneW     1.249          0.487    2.57   0.0103 *
(Intercept)       time2       time3    EcozoneW 
      0.466       0.555       0.547       3.488

I have tried the full-interaction model ("time*Ecozone"), although it is not
justified from the standpoint of AIC values. This improves the situation
slightly, although confidence intervals do not indicate a significant
decrease in snags after treatment, contrary to what common sense suggests.
Do I need to somehow be modeling the big decrease in variance between time1
and time2 in order to make sense of these data? Data below.
--Seth W. Bigelow


"SettingID","Ecozone","time","snag"
"0506530270068","E","1",0
"0506530290166","E","1",0
"0506530290181","E","1",0
"0506530300115","E","1",0
"0506530300124","E","1",0
"0506530300136","E","1",2
"0506530380015","E","1",0
"0506530380037","E","1",0
"0506530390006","E","1",0
"0506530390119","E","1",0
"0506530400036","E","1",0
"0506530400049","E","1",0
"050653040280049","E","1",0
"0506530430040","E","1",0
"0506530430047","E","1",0
"0506530430065","E","1",0
"0506530430089","E","1",0
"0506580130060","E","1",2
"0506580130155","E","1",0
"0506580180062","E","1",2
"0506581900343","E","1",2
"0506581900346","E","1",4
"0511511060010","E","1",0
"0511511210049","E","1",0
"0511515070003","E","1",6
"0511515070018","E","1",0
"0511515070034","E","1",6
"0511515070054","E","1",0
"0511515160010","E","1",0
"0511515160013","E","1",4
"0511522070132","E","1",0
"0511522070151","E","1",1
"0511522070152","E","1",2
"0511525010008","E","1",0
"0511525010022","E","1",4
"0511525170005","E","1",4
"0517566200042","E","1",12
"0517566200044","E","1",2
"0517566300087","E","1",0
"0517566300095","E","1",4
"0517566350026","E","1",6
"0517566400015","E","1",8
"0517566400086","E","1",0
"0506512840152","W","1",0
"0506512840186","W","1",0
"0506513810068","W","1",0
"0511526022084","W","1",0
"0511526022085","W","1",0
"0511526030034","W","1",12
"0511526050084","W","1",12
"05115260516B0084","W","1",3
"05115260516G0084","W","1",7
"05115260516H0084","W","1",2
"051152605A0084","W","1",8
"0511526130477","W","1",0
"0511526130543","W","1",5
"0511526130619","W","1",2
"0511526190087","W","1",0
"0511534210306","W","1",0
"0511534210307","W","1",2
"0511534210308","W","1",2
"0511534210312","W","1",10
"0517567400160","W","1",4
"0517567400182","W","1",3
"0506530270068","E","2",0
"0506530290166","E","2",0
"0506530290181","E","2",0
"0506530300115","E","2",0
"0506530300124","E","2",0
"0506530300136","E","2",0
"0506530380015","E","2",0
"0506530380037","E","2",0
"0506530390006","E","2",0
"0506530390119","E","2",0
"0506530400036","E","2",0
"0506530400049","E","2",0
"050653040280049","E","2",0
"0506530430040","E","2",0
"0506530430047","E","2",0
"0506530430065","E","2",0
"0506530430089","E","2",0
"0506580130060","E","2",0
"0506580130155","E","2",0
"0506580180062","E","2",0
"0506581900343","E","2",0
"0506581900346","E","2",0
"0511511060010","E","2",0
"0511511210049","E","2",0
"0511515070003","E","2",4
"0511515070018","E","2",2
"0511515070034","E","2",4
"0511515070054","E","2",0
"0511515160010","E","2",0
"0511515160013","E","2",4
"0511522070132","E","2",0
"0511522070151","E","2",1
"0511522070152","E","2",2
"0511525010008","E","2",0
"0511525010022","E","2",2
"0511525170005","E","2",1
"0517566200042","E","2",2
"0517566200044","E","2",2
"0517566300087","E","2",0
"0517566300095","E","2",0
"0517566350026","E","2",0
"0517566400015","E","2",4
"0517566400086","E","2",0
"0506512840152","W","2",0
"0506512840186","W","2",0
"0506513810068","W","2",0
"0511526022084","W","2",0
"0511526022085","W","2",0
"0511526030034","W","2",4
"0511526050084","W","2",12
"05115260516B0084","W","2",2
"05115260516G0084","W","2",2
"05115260516H0084","W","2",2
"051152605A0084","W","2",2
"0511526130477","W","2",0
"0511526130543","W","2",5
"0511526130619","W","2",2
"0511526190087","W","2",0
"0511534210306","W","2",2
"0511534210307","W","2",0
"0511534210308","W","2",0
"0511534210312","W","2",8
"0517567400160","W","2",2
"0517567400182","W","2",4
"0506530270068","E","3",0
"0506530290166","E","3",0
"0506530290181","E","3",0
"0506530300115","E","3",0
"0506530300124","E","3",0
"0506530300136","E","3",0
"0506530380015","E","3",0
"0506530380037","E","3",0
"0506530390006","E","3",0
"0506530390119","E","3",0
"0506530400036","E","3",0
"0506530400049","E","3",0
"050653040280049","E","3",0
"0506530430040","E","3",2
"0506530430047","E","3",0
"0506530430065","E","3",0
"0506530430089","E","3",0
"0506580130060","E","3",0
"0506580130155","E","3",1
"0506580180062","E","3",0
"0506581900343","E","3",0
"0506581900346","E","3",0
"0511511060010","E","3",0
"0511511210049","E","3",0
"0511515070003","E","3",2
"0511515070018","E","3",0
"0511515070034","E","3",4
"0511515070054","E","3",0
"0511515160010","E","3",0
"0511515160013","E","3",4
"0511522070132","E","3",1
"0511522070151","E","3",0
"0511522070152","E","3",0
"0511525010008","E","3",3
"0511525010022","E","3",2
"0511525170005","E","3",0
"0517566200042","E","3",2
"0517566200044","E","3",0
"0517566300087","E","3",0
"0517566300095","E","3",0
"0517566350026","E","3",0
"0517566400015","E","3",4
"0517566400086","E","3",0
"0506512840152","W","3",0
"0506512840186","W","3",0
"0506513810068","W","3",0
"0511526022084","W","3",0
"0511526022085","W","3",0
"0511526030034","W","3",4
"0511526050084","W","3",10
"05115260516B0084","W","3",3
"05115260516G0084","W","3",0
"05115260516H0084","W","3",6
"051152605A0084","W","3",4
"0511526130477","W","3",0
"0511526130543","W","3",9
"0511526130619","W","3",0
"0511526190087","W","3",0
"0511534210306","W","3",0
"0511534210307","W","3",2
"0511534210308","W","3",0
"0511534210312","W","3",10
"0517567400160","W","3",2
"0517567400182","W","3",3

-----Original Message-----
From: Seth W. Bigelow [mailto:seth at swbigelow.net] 
Sent: Friday, October 05, 2012 6:44 PM
To: 'Highland Statistics Ltd'; 'r-sig-mixed-models at r-project.org'
Subject: RE: [R-sig-ME] Fwd: Re: Mixed model and negative binomial
distribution

Alain, thank you very much for the advice regarding analysis of my (negative
binomial vs. poisson-distributed) snag data, and my warmest congratulations
on your nuptials
-Seth W. Bigelow

-----Original Message-----
From: r-sig-mixed-models-bounces at r-project.org
[mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of Highland
Statistics Ltd
Sent: Friday, October 05, 2012 1:04 PM
To: r-sig-mixed-models at r-project.org
Subject: [R-sig-ME] Fwd: Re: Mixed model and negative binomial distribution




-------- Original Message --------
Subject: 	Re: Mixed model and negative binomial distribution
Date: 	Fri, 05 Oct 2012 12:37:28 +0200
From: 	Highland Statistics Ltd <highstat at highstat.com>
To: 	r-sig-mixed-models at r-project.org <r-sig-mixed-models at r-project.org>



------------------------------

Message: 2
Date: Fri, 5 Oct 2012 04:32:36 +0000 (UTC)
From: Ben Bolker <bbolker at gmail.com>
To: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] Mixed model and negative binomial distribution
Message-ID: <loom.20121005T062854-986 at post.gmane.org>
Content-Type: text/plain; charset=us-ascii

Seth W. Bigelow <seth at ...> writes:
In
-----------
Yesterday, Alain was on a nice beach in the Caribbean enjoying his wedding.
Now he is close to the north pole.


Yes....this sounds MCMC. And code for NB GLMM with AR1 correlation and zero
inflation is indeed in the 2012 book.
However...if the zeros are sequential in time then the correlation and zero
inflation components may fight
for the same information. And the NB distribution may try to take its share
of zeros as well. Better start with
the Poisson version of it....and if a Poisson + correlation + ZIP still
shows problems, then consider the NB.
For irregular time series consider a CAR correlation.

As to the implementation..try WinBUGS, OpenBUGS, JAGS....code is nearly the
same. I recently downloaded STAN...looks promising too.

You may also want to have a look at "Generalized Additive Mixed Model
Analysis via gammSlice" by Pham and Wand.
GAMM is essentially a GLMM.... But I don;t think it can do zero
inflation....but perhaps it does NB GLMM. Not sure.



Alain
2 days later
#
Seth--

It is not clear to me what the disconnect is between the GLMM results 
and what you expect?  For example:

"It's tricky to interpret the log-transformed coefficients and associated
confidence intervals, but they don't seem to reflect the decrease in snags
after treatment that I see when I examine the cell means (e.g., mean East
density at time1 (pre-treatment) is 1.7, dropping to 0.7 at time2 (after
treatment).)"

I fit your model, and here are the results including random-effects:

Coefficients:
                  Estimate Std. Error z value Pr(>|z|)
(Intercept)        -0.764      0.359   -2.13   0.0334 *
as.factor(time)2   -0.588      0.186   -3.16   0.0016 **
as.factor(time)3   -0.604      0.189   -3.20   0.0014 **
EcozoneW            1.249      0.487    2.57   0.0103 *
---
Signif. codes:  0 ?***? 0.001 ?**? 0.01 ?*? 0.05 ?.? 0.1 ? ? 1

Number of observations: total=192, SettingID=64
Random effect variance(s):
Group=SettingID
             Variance StdDev
(Intercept)     2.37   1.54
Negative binomial dispersion parameter: 1.812 (std. err.: 0.33774)

So, we can exponentiate these coefficients to get *conditional* rate 
ratios, or we can include the random-effects to get marginal predictions 
of snag from the model.

Conditional RR:

 > cbind(exp(fixef(my.admb)))
                       [,1]
(Intercept)      0.4660185
as.factor(time)2 0.5551761
as.factor(time)3 0.5467793
EcozoneW         3.4879006

Thus, conditional on the random-effects, there is a 45% decrease in 
snags at time 2.  That seems reasonable given the data, I think?

Marginal predictions:

 > exp(-0.764 + 2.37/2) # marginal intercept
[1] 1.523484

 > exp(-0.764 - 0.588 + 2.37/2) # marginal prediction at time2
[1] 0.8461996

Again, seem sort of reasonable, with the caveat that I am a psychologist 
who studies drug addiction and marital therapy... ;)

If some of your questions are general questions about how to interpret 
the output from Poisson / NB mixed models, then you might take a look at 
a tutorial article that we wrote (with R code), which also points to 
other resources:

http://depts.washington.edu/cshrb/newweb/statstutorials.html

Atkins, D. C., Baldwin, S., Zheng, C., Gallop, R. J., & Neighbors, C. 
(in press). A tutorial on count regression and zero-altered count models 
for longitudinal addictions data.

Granted, the example data come from alcohol focused studies (but 
interestingly enough, it seems the addictions researchers and ecologists 
have similar types of outcomes / models).

If I have missed your question, please let me know.

cheers, Dave
#
Dave, 

I skimmed your tutorial after you first mentioned it but now am going
through line by line. It explains everything beautifully and simply, I
strongly recommend 

"Atkins, D. C., Baldwin, S., Zheng, C., Gallop, R. J., & Neighbors, C. (in
press). A tutorial on count regression and zero-altered count models for
longitudinal addictions data." 

...for anyone seeking a compact introduction to poisson & negative
binomial-based mixed models, regardless of the field they are in.

Thank you for running my snag data and explaining how to interpret
coefficients of mixed-models based on poisson and negative binomial
distributions.

--Seth W. Bigelow



-----Original Message-----
From: r-sig-mixed-models-bounces at r-project.org
[mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of David Atkins
Sent: Tuesday, October 23, 2012 1:35 AM
To: r-sig-mixed-models at r-project.org
Subject: Re: [R-sig-ME] Mixed model and negative binomial distribution


Seth--

It is not clear to me what the disconnect is between the GLMM results 
and what you expect?  For example:

"It's tricky to interpret the log-transformed coefficients and associated
confidence intervals, but they don't seem to reflect the decrease in snags
after treatment that I see when I examine the cell means (e.g., mean East
density at time1 (pre-treatment) is 1.7, dropping to 0.7 at time2 (after
treatment).)"

I fit your model, and here are the results including random-effects:

Coefficients:
                  Estimate Std. Error z value Pr(>|z|)
(Intercept)        -0.764      0.359   -2.13   0.0334 *
as.factor(time)2   -0.588      0.186   -3.16   0.0016 **
as.factor(time)3   -0.604      0.189   -3.20   0.0014 **
EcozoneW            1.249      0.487    2.57   0.0103 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Number of observations: total=192, SettingID=64
Random effect variance(s):
Group=SettingID
             Variance StdDev
(Intercept)     2.37   1.54
Negative binomial dispersion parameter: 1.812 (std. err.: 0.33774)

So, we can exponentiate these coefficients to get *conditional* rate 
ratios, or we can include the random-effects to get marginal predictions 
of snag from the model.

Conditional RR:

 > cbind(exp(fixef(my.admb)))
                       [,1]
(Intercept)      0.4660185
as.factor(time)2 0.5551761
as.factor(time)3 0.5467793
EcozoneW         3.4879006

Thus, conditional on the random-effects, there is a 45% decrease in 
snags at time 2.  That seems reasonable given the data, I think?

Marginal predictions:

 > exp(-0.764 + 2.37/2) # marginal intercept
[1] 1.523484

 > exp(-0.764 - 0.588 + 2.37/2) # marginal prediction at time2
[1] 0.8461996

Again, seem sort of reasonable, with the caveat that I am a psychologist 
who studies drug addiction and marital therapy... ;)

If some of your questions are general questions about how to interpret 
the output from Poisson / NB mixed models, then you might take a look at 
a tutorial article that we wrote (with R code), which also points to 
other resources:

http://depts.washington.edu/cshrb/newweb/statstutorials.html

Atkins, D. C., Baldwin, S., Zheng, C., Gallop, R. J., & Neighbors, C. 
(in press). A tutorial on count regression and zero-altered count models 
for longitudinal addictions data.

Granted, the example data come from alcohol focused studies (but 
interestingly enough, it seems the addictions researchers and ecologists 
have similar types of outcomes / models).

If I have missed your question, please let me know.

cheers, Dave